7 Pseudorandom Generators


7.1 Motivation and Definition 


In the previous sections, we have seen a number of interesting 
derandomization results: 


• Derandomizing specific algorithms, such as the ones for
MaxCut and Undirected S-T Connectivity;

• Giving explicit (efficient, deterministic) constructions of
various pseudorandom objects, such as expanders, extractors,
and list-decodable codes, as well as showing various
relations between them;

• Reducing the randomness needed for certain tasks, such
as sampling and amplifying the success probability of
randomized algorithms; and

• Simulating BPP with any weak random source.


However, all of these still fall short of answering our original 
motivating question, of whether every randomized algorithm can be 
efficiently derandomized. That is, does BPP = P? 

As we have seen, one way to resolve this question in the positive is 
to use the following two-step process: First show that the number of
random bits for any BPP algorithm can be reduced from poly(n) to
$O(\log n)$, and then eliminate the randomness entirely by enumeration.

Thus, we would like to have a function G that stretches a seed 
of $d = O(\log n)$ truly random bits into $m = \mathrm{poly}(n)$ bits that “look
random.” Such a function is called a pseudorandom generator. The 
question is how we can formalize the requirement that the output 
should “look random” in such a way that (a) the output can be used 
in place of the truly random bits in any BPP algorithm, and (b) such 
a generator exists. 

Some candidate definitions for what it means for the random 
variable $X = G(U_d)$ to “look random” include the following:


• Information-theoretic or statistical measures: For example,
we might measure entropy of $G(U_d)$, its statistical difference
from the uniform distribution, or require pairwise independence.
All of these fail one of the two criteria. For example,
it is impossible for a deterministic function to increase
entropy from $O(\log n)$ to poly(n). And it is easy to construct
algorithms that fail when run using random bits that are
only guaranteed to be pairwise independent.

• Kolmogorov complexity: A string x “looks random” if it is
incompressible (cannot be generated by a Turing machine
with a description of length less than $|x|$). An appealing
aspect of this notion is that it makes sense of the randomness
in a fixed string (rather than a distribution). Unfortunately,
it is not suitable for our purposes. Specifically, if the function
G is computable (which we certainly want) then all of its
outputs have Kolmogorov complexity $d = O(\log n)$ (just
hardwire the seed into the TM computing G), and hence are
very compressible.

• Computational indistinguishability: This is the measure we
will use. Intuitively, we say that a random variable X “looks
random” if no efficient algorithm can distinguish X from a
truly uniform random variable. Another perspective comes
from the definition of statistical difference:

$$\Delta(X, Y) = \max_T \left| \Pr[X \in T] - \Pr[Y \in T] \right|.$$

With computational indistinguishability, we simply restrict
the max to be taken only over “efficient” statistical tests T,
that is, tests T for which membership can be efficiently tested.
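
To make the contrast between statistical difference and its restricted version concrete, the following minimal Python sketch computes both quantities exactly on a toy example. The names `statistical_difference`, `advantage`, and `single_bit_tests` are ours, and the class of single-bit tests is only a hypothetical stand-in for “efficient” tests; it is not the circuit-size notion used later in this section.

```python
from fractions import Fraction
from itertools import product


def statistical_difference(px, py):
    """Exact max over all tests T of |Pr[X in T] - Pr[Y in T]|."""
    support = set(px) | set(py)
    # The maximum is attained by the test T = {z : px(z) > py(z)}.
    return sum(max(px.get(z, 0) - py.get(z, 0), 0) for z in support)


def advantage(test, px, py):
    """|Pr[test(X) = 1] - Pr[test(Y) = 1]| for a single 0/1-valued test."""
    accept_x = sum(p for z, p in px.items() if test(z))
    accept_y = sum(p for z, p in py.items() if test(z))
    return abs(accept_x - accept_y)


m = 3
strings = list(product((0, 1), repeat=m))
uniform = {z: Fraction(1, 2**m) for z in strings}

# Toy X: uniform over the even-parity strings of length m.
even = [z for z in strings if sum(z) % 2 == 0]
x_dist = {z: Fraction(1, len(even)) for z in even}

# Hypothetical restricted class: tests that inspect a single bit.
single_bit_tests = [lambda z, i=i: z[i] == 1 for i in range(m)]
best_restricted = max(advantage(t, x_dist, uniform) for t in single_bit_tests)

# The unrestricted maximum is witnessed by the parity test.
parity_advantage = advantage(lambda z: sum(z) % 2 == 0, x_dist, uniform)
```

In this example the statistical difference from uniform is 1/2, witnessed by the parity test, yet every test in the restricted class has advantage 0: restricting the class of tests can only shrink the maximum.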


7.1.1 Computational Indistinguishability 


Definition 7.1 (computational indistinguishability). Random
variables X and Y taking values in $\{0,1\}^m$ are $(t,\varepsilon)$ indistinguishable
if for every nonuniform algorithm T running in time at most t, we have

$$\left| \Pr[T(X) = 1] - \Pr[T(Y) = 1] \right| \le \varepsilon.$$

The left-hand side above is also called the advantage of T.


Recall that a nonuniform algorithm is an algorithm that may have 
some nonuniform advice hardwired in. (See Definition 3.10.) If the 
algorithm runs in time t we require that the advice string is of length at 
most t. Typically, to make sense of complexity measures like running 
time, it is necessary to use asymptotic notions, because a Turing 
machine can encode a huge lookup table for inputs of any bounded size 
in its transition function. However, for nonuniform algorithms, we can 
avoid doing so by using Boolean circuits as our nonuniform model of 
computation. Similarly to Fact 3.11, every nonuniform Turing machine 
algorithm running in time t(n) can be simulated by a sequence of 
Boolean circuits $C_n$ of size O(t(n)) and conversely every sequence of
Boolean circuits of size s(n) can be simulated by a nonuniform Turing 
machine running in time O(s(n)). Thus, to make our notation cleaner, 
from now on, by “nonuniform algorithm running in time t,” we mean 
“Boolean circuit of size t,” where we measure the size by the number 
of AND and OR gates in the circuit. (For convenience, we don’t 
count the inputs and negations in the circuit size.) Note also that in 
Definition 7.1 we have not specified whether the distinguisher is
deterministic or randomized; this is because a probabilistic distinguisher
achieving advantage greater than $\varepsilon$ can be turned into a deterministic
distinguisher achieving advantage greater than $\varepsilon$ by nonuniformly
fixing the randomness. (This is another example of how “nonuniformity
is more powerful than randomness,” like in Corollary 3.12.)

It is also of interest to study computational indistinguishability
and pseudorandomness against uniform algorithms.


Definition 7.2 (uniform computational indistinguishability).
Let $X_m, Y_m$ be some sequences of random variables on $\{0,1\}^m$ (or
$\{0,1\}^{\mathrm{poly}(m)}$). For functions $t : \mathbb{N} \to \mathbb{N}$ and $\varepsilon : \mathbb{N} \to [0,1]$, we say that
$\{X_m\}$ and $\{Y_m\}$ are $(t(m),\varepsilon(m))$ indistinguishable for uniform
algorithms if for all probabilistic algorithms T running in time t(m), we
have that

$$\left| \Pr[T(X_m) = 1] - \Pr[T(Y_m) = 1] \right| \le \varepsilon(m)$$

for all sufficiently large m, where the probabilities are taken over $X_m$,
$Y_m$ and the random coin tosses of T.


We will focus on the nonuniform definition in this survey, but will 
mention results about the uniform definition as well. 


7.1.2 Pseudorandom Generators 


Definition 7.3. A deterministic function $G : \{0,1\}^d \to \{0,1\}^m$ is a
$(t,\varepsilon)$ pseudorandom generator (PRG) if

(1) $d < m$, and
(2) $G(U_d)$ and $U_m$ are $(t,\varepsilon)$ indistinguishable.


Note that we have formulated the definition with respect
to nonuniform computational indistinguishability, but an analogous 
uniform definition can be given. 

People attempted to construct pseudorandom generators long
before this definition was formulated. Their generators were tested
against a battery of statistical tests (e.g., the number of 1s and 0s are
approximately the same, the longest run is of length $O(\log m)$, etc.),
but these fixed sets of tests provided no guarantee that the generators
would perform well in an arbitrary application (e.g., in cryptography
or derandomization). Indeed, most classical constructions (e.g., linear
congruential generators, as implemented in the standard C library)
are known to fail in some applications.
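
As a small illustration of how a fixed battery of tests can miss a simple efficient distinguisher, consider a linear congruential generator whose modulus is a power of two and whose multiplier and increment are odd: its lowest-order bit then alternates deterministically. The sketch below is ours; the constants are illustrative, and using the lowest-order bit of the state as output is an assumption made for the example, not a description of any particular library.

```python
from fractions import Fraction
from itertools import product


def lcg_low_bits(seed, m, a=1103515245, c=12345, modulus=2**31):
    """Lowest-order bit of each of m successive LCG states (illustrative parameters)."""
    x = seed % modulus
    bits = []
    for _ in range(m):
        x = (a * x + c) % modulus
        bits.append(x & 1)
    return bits


def alternation_test(bits):
    """Distinguisher T: output 1 iff every pair of consecutive bits differs."""
    return int(all(b1 != b2 for b1, b2 in zip(bits, bits[1:])))


m = 8
# Acceptance probability on generator outputs, over 100 sample seeds.
pass_lcg = Fraction(sum(alternation_test(lcg_low_bits(s, m)) for s in range(100)), 100)
# Exact acceptance probability on uniform strings: only 2 of the 2^m strings alternate.
pass_uniform = Fraction(sum(alternation_test(z) for z in product((0, 1), repeat=m)), 2**m)
advantage = pass_lcg - pass_uniform
```

Because a and c are odd and the modulus is even, each step flips the parity of the state, so the test accepts every generator output but only a $2^{-(m-1)}$ fraction of uniform strings.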

Intuitively, the above definition guarantees that the pseudorandom 
bits produced by the generator are as good as truly random bits for all 
efficient purposes (where efficient means time at most t). In particular, 
we can use such a generator for derandomizing any algorithm of 
running time at most t. For the derandomization to be efficient, we 
will also need the generator to be efficiently computable. 


Definition 7.4. We say a sequence of generators
$\{G_m : \{0,1\}^{d(m)} \to \{0,1\}^m\}$ is computable in time t(m) if there is a uniform and
deterministic algorithm M such that for every $m \in \mathbb{N}$ and $x \in \{0,1\}^{d(m)}$,
we have $M(m,x) = G_m(x)$ and M(m,x) runs in time at most t(m). In
addition, M(m) (with no second input) should output the value d(m)
in time at most t(m).


Note that even when we define the pseudorandomness property of 
the generator with respect to nonuniform algorithms, the efficiency 
requirement refers to uniform algorithms. As usual, for readability, we 
will usually refer to a single generator $G : \{0,1\}^{d(m)} \to \{0,1\}^m$, with it
being implicit that we are really discussing a family {Gm}. 


Theorem 7.5. Suppose that for all m there exists an (m,1/8)
pseudorandom generator $G : \{0,1\}^{d(m)} \to \{0,1\}^m$ computable in time
t(m). Then $\mathrm{BPP} \subseteq \bigcup_c \mathrm{DTIME}\left(2^{d(n^c)} \cdot (n^c + t(n^c))\right)$.


Proof. Let A(x;r) be a BPP algorithm that on inputs x of length n,
can be simulated by Boolean circuits of size at most $n^c$, using coin
tosses r. Without loss of generality, we may assume that $|r| = n^c$. (It will
often be notationally convenient to assume that the number of random 
bits used by an algorithm equals its running time or circuit size, so as 
to avoid an extra parameter. However, most interesting algorithms will 
only actually read and compute with a smaller number of these bits, so 
as to leave time available for computation. Thus, one should actually 
think of an algorithm as only reading a prefix of its random string r.)

The idea is to replace the random bits used by A with pseudorandom
bits generated by G, use the pseudorandomness property to show
that the algorithm will still be correct with high probability, and finally
enumerate over all possible seeds to obtain a deterministic algorithm.


Claim 7.6. For every x of length n, $A(x; G(U_{d(n^c)}))$ errs with
probability smaller than 1/2.


Proof of Claim: Suppose that there exists some x on which
$A(x; G(U_{d(n^c)}))$ errs with probability at least 1/2. Then $T(\cdot) = A(x; \cdot)$
is a Boolean circuit of size at most $n^c$ that distinguishes $G(U_{d(n^c)})$
from $U_{n^c}$ with advantage at least $1/2 - 1/3 > 1/8$. (Notice that we are
using the input x as nonuniform advice; this is why we need the PRG
to be pseudorandom against nonuniform tests.) □


Now, enumerate over all seeds of length $d(n^c)$ and take a majority
vote. There are $2^{d(n^c)}$ of them, and for each we have to run both G
and A. □
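
The enumeration step at the end of the proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_G` and `toy_A` are hypothetical stand-ins of our own (in particular, `toy_G` is not pseudorandom), chosen so that the majority vote can be checked by hand.

```python
from itertools import product


def derandomize(A, G, d, x):
    """Run A(x; G(s)) for every seed s in {0,1}^d and return the majority answer."""
    votes = sum(A(x, G(seed)) for seed in product((0, 1), repeat=d))
    return int(2 * votes > 2**d)


def toy_G(seed):
    """Hypothetical 'generator' {0,1}^4 -> {0,1}^8 that repeats its seed; NOT a real PRG."""
    return seed + seed


def toy_A(x, r):
    """Toy randomized decider for 'x is even' that errs exactly when r starts with 1, 1."""
    correct = int(x % 2 == 0)
    return 1 - correct if (r[0], r[1]) == (1, 1) else correct


# toy_A errs on 4 of the 16 seeds, so the majority over all seeds is always correct.
answers = [derandomize(toy_A, toy_G, 4, x) for x in range(6)]
```

The running time is $2^d$ evaluations of G and A, which matches the $2^{d(n^c)} \cdot (n^c + t(n^c))$ bound in the theorem.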


Notice that we can afford for the generator G to have running time 
$t(m) = \mathrm{poly}(m)$ or even $t(m) = \mathrm{poly}(m) \cdot 2^{O(d(m))}$ without affecting
the time of the derandomization by more than a polynomial
amount. In particular, for this application, it is OK if the generator 
runs in more time than the tests it fools (which are time m in this 
theorem). That is, for derandomization, it suffices to have G that is 
mildly explicit according to the following definition: 


Definition 7.7. 
(1) A generator $G : \{0,1\}^{d(m)} \to \{0,1\}^m$ is mildly explicit if it is
computable in time $\mathrm{poly}(m, 2^{d(m)})$.
(2) A generator $G : \{0,1\}^{d(m)} \to \{0,1\}^m$ is fully explicit if it is
computable in time poly(m).


These definitions are analogous to the notions of mildly explicit 
and fully explicit for expander graphs in Section 4.3. The truth table
of a mildly explicit generator can be constructed in time polynomial in
its size (which is $m \cdot 2^{d(m)}$), whereas a fully explicit generator
can be evaluated in time polynomial in its input and output lengths 
(like the neighbor function of a fully explicit expander). 

Theorem 7.5 provides a tradeoff between the seed length of the 
PRG and the efficiency of the derandomization. Let’s look at some 
typical settings of parameters to see how we might simulate BPP in 
the different deterministic time classes (see Definition 3.1): 


(1) Suppose that for every constant $\varepsilon > 0$, there is an (m,1/8)
mildly explicit PRG with seed length $d(m) = m^{\varepsilon}$. Then
$\mathrm{BPP} \subseteq \bigcap_{\varepsilon > 0} \mathrm{DTIME}(2^{n^{\varepsilon}}) = \mathrm{SUBEXP}$. Since it is
known that SUBEXP is a proper subset of EXP, this is
already a nontrivial improvement on the current inclusion
$\mathrm{BPP} \subseteq \mathrm{EXP}$ (Proposition 3.2).

(2) Suppose that there is an (m,1/8) mildly explicit
PRG with seed length $d(m) = \mathrm{polylog}(m)$. Then
$\mathrm{BPP} \subseteq \bigcup_c \mathrm{DTIME}(2^{\log^c n}) = \tilde{\mathrm{P}}$.

(3) Suppose that there is an (m,1/8) mildly explicit PRG with
seed length $d(m) = O(\log m)$. Then BPP = P.
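
As a worked instance of setting (3), suppose (these constants are our own notation) that $d(m) \le a \log_2 m$ and that the generator runs in time $t(m) \le m^b$, which is what mild explicitness gives when $2^{d(m)} = \mathrm{poly}(m)$. Plugging into the time bound of Theorem 7.5:

```latex
\begin{align*}
2^{d(n^c)} \cdot \bigl(n^c + t(n^c)\bigr)
  &\le 2^{a \log_2 (n^c)} \cdot \bigl(n^c + n^{bc}\bigr) \\
  &= n^{ac} \cdot \bigl(n^c + n^{bc}\bigr) \\
  &= n^{O(1)}.
\end{align*}
```

Since a, b, and c are constants, the simulation runs in polynomial time, which is the content of BPP = P in this setting.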


Of course, all of these derandomizations are contingent on the 
question of whether PRGs exist. As usual, our first answer is yes but 
the proof is not very helpful — it is nonconstructive and thus does not 
provide for an efficiently computable PRG: 


Proposition 7.8. For all $m \in \mathbb{N}$ and $\varepsilon > 0$, there exists a (nonexplicit)
$(m,\varepsilon)$ pseudorandom generator $G : \{0,1\}^d \to \{0,1\}^m$ with seed length
$d = O(\log m + \log(1/\varepsilon))$.


Proof. The proof is by the probabilistic method. Choose
$G : \{0,1\}^d \to \{0,1\}^m$ at random. Now, fix a time m algorithm, T.
The probability (over the choice of G) that T distinguishes $G(U_d)$
from $U_m$ with advantage $\varepsilon$ is at most $2^{-\Omega(2^d \varepsilon^2)}$ by a Chernoff
bound. There are $2^{\mathrm{poly}(m)}$ nonuniform algorithms running in time m
(i.e., circuits of size m). Thus, union-bounding over all possible T,
we get that the probability that there exists a T breaking G
is at most $2^{\mathrm{poly}(m)} \cdot 2^{-\Omega(2^d \varepsilon^2)}$, which is less than 1 for d being
$O(\log m + \log(1/\varepsilon))$. □
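
The calculation behind the proof can be written out explicitly. The following is a sketch under our own notation: $N = 2^d$, and $p(m)$ is a polynomial such that there are at most $2^{p(m)}$ circuits of size m. For a fixed T and a uniformly random function G, the values $T(G(s))$ over the N seeds are independent bits with mean $\Pr[T(U_m) = 1]$, so Hoeffding's inequality (one standard form of the Chernoff bound) applies:

```latex
\begin{align*}
\Pr_G\Bigl[\,\Bigl|\tfrac{1}{N}\sum_{s \in \{0,1\}^d} T(G(s)) - \Pr[T(U_m) = 1]\Bigr| > \varepsilon\Bigr]
  &\le 2e^{-2N\varepsilon^2}, \\
\Pr_G\bigl[\exists\, T \text{ breaking } G\bigr]
  &\le 2^{p(m)} \cdot 2e^{-2N\varepsilon^2} \;<\; 1
  \quad\text{whenever}\quad N > \frac{(p(m)+1)\ln 2}{2\varepsilon^2}.
\end{align*}
```

Taking logarithms, $d = \log_2 p(m) + 2\log_2(1/\varepsilon) + O(1) = O(\log m + \log(1/\varepsilon))$ suffices.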


Note that putting together Proposition 7.8 and Theorem 7.5 gives 
us another way to prove that BPP C P/poly (Corollary 3.12): just let 
the advice string be the truth table of an $(n^c,1/8)$ PRG (which can be
described by $2^d \cdot n^c = \mathrm{poly}(n)$ bits), and then use that PRG in the
proof of Theorem 7.5 to derandomize the BPP algorithm. However, if 
you unfold both this proof and our previous proof (where we do error 
reduction and then fix the coin tosses), you will see that both proofs 
amount to essentially the same “construction.” 


7.2 Cryptographic PRGs 


The theory of computational pseudorandomness discussed in this sec- 
tion emerged from cryptography, where researchers sought a defini- 
tion that would ensure that using pseudorandom bits instead of truly 
random bits (e.g., when encrypting a message) would retain security 
against all computationally feasible attacks. In this setting, the gener- 
ator G is used by the honest parties and thus should be very efficient 
to compute. On the other hand, the distinguisher T corresponds to an 
attack carried out by an adversary, and we want to protect against
adversaries that may invest a lot of computational resources into trying 
to break the system. Thus, one is led to require that the pseudorandom 
generators be secure even against distinguishers with greater running 
time than the generator. The most common setting of parameters in the 
theoretical literature is that the generator should run in a fixed polyno- 
mial time, but the adversary can run in an arbitrary polynomial time. 


Definition 7.9. A generator $G_m : \{0,1\}^{d(m)} \to \{0,1\}^m$ is a
cryptographic pseudorandom generator if:

(1) $G_m$ is fully explicit. That is, there is a constant b such that
$G_m$ is computable in time $m^b$.
(2) $G_m$ is an $(m^{\omega(1)}, 1/m^{\omega(1)})$ PRG. That is, for every constant
c, $G_m$ is an $(m^c, 1/m^c)$ pseudorandom generator for all
sufficiently large m.


Due to space constraints and the fact that such generators are 
covered in other texts (see the Chapter Notes and References), we will 
not do an in-depth study of cryptographic generators, but just survey 
what is known about them. 

The first question to ask is whether such generators exist at all. 
It is not hard to show that cryptographic pseudorandom generators 
cannot exist unless $\mathrm{P} \ne \mathrm{NP}$, indeed unless $\mathrm{NP} \not\subseteq \mathrm{P/poly}$. (See
Problem 7.3.) Thus, we do not expect to establish the existence of such
generators unconditionally, and instead need to make some complexity
assumption. While it would be wonderful to show that $\mathrm{NP} \not\subseteq \mathrm{P/poly}$
implies the existence of cryptographic pseudorandom generators, that 
too seems out of reach. However, we can base them on the very 
plausible assumption that there are functions that are easy to evaluate 
but hard to invert. 


Definition 7.10. $f_n : \{0,1\}^n \to \{0,1\}^n$ is a one-way function if:

(1) There is a constant b such that $f_n$ is computable in time $n^b$
for sufficiently large n.
(2) For every constant c and every nonuniform algorithm A
running in time $n^c$:

$$\Pr\left[A(f_n(U_n)) \in f_n^{-1}(f_n(U_n))\right] \le \frac{1}{n^c}$$

for all sufficiently large n.


Assuming the existence of one-way functions seems stronger than 
the assumption $\mathrm{NP} \not\subseteq \mathrm{P/poly}$. For example, it is an average-case
complexity assumption, as it requires that f is hard to invert when
evaluated on random inputs. Nevertheless, there are a number of
candidate functions believed to be one-way. The simplest is integer
multiplication: $f_n(x,y) = x \cdot y$, where x and y are n/2-bit numbers.
Inverting this function amounts to the integer factorization problem,
for which no efficient algorithm is known. 
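
The asymmetry between evaluating and inverting the multiplication function can be seen in a small Python sketch. The function names are ours, and trial division is used only as the most naive inverter; it is not meant to represent the best known factoring algorithms.

```python
def multiply(x, y):
    """The candidate one-way function f(x, y) = x * y; a single multiplication."""
    return x * y


def invert_by_trial_division(z, half_bits):
    """Find some (x, y) with x * y == z and x, y < 2**half_bits by brute force.

    Any preimage counts as a successful inversion. The loop may run for up to
    2**half_bits iterations, which is exponential in the input length.
    """
    bound = 2**half_bits
    for x in range(1, bound):
        if z % x == 0 and z // x < bound:
            return x, z // x
    return None  # z is not a product of two numbers below the bound


z = multiply(241, 251)                      # 241 and 251 are 8-bit primes
preimage = invert_by_trial_division(z, 8)   # scans candidates x = 1, 2, 3, ...
```

Evaluating f takes time polynomial in the bit length, whereas this inverter's running time grows like $2^{n/2}$ on n-bit inputs.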

A classic and celebrated result in the foundations of cryptography 
is that cryptographic pseudorandom generators can be constructed 
from any one-way function: 


Theorem 7.11. The following are equivalent: 


(1) One-way functions exist. 

(2) There exist cryptographic pseudorandom generators with
seed length $d(m) = m - 1$.

(3) For every constant $\varepsilon > 0$, there exist cryptographic
pseudorandom generators with seed length $d(m) = m^{\varepsilon}$.


Corollary 7.12. If one-way functions exist, then $\mathrm{BPP} \subseteq \mathrm{SUBEXP}$.


What about getting a better derandomization? The proof of the
above theorem is more general quantitatively. It takes any one-way
function $f_\ell : \{0,1\}^\ell \to \{0,1\}^\ell$ and a parameter m, and constructs a
generator $G_m : \{0,1\}^{\mathrm{poly}(\ell)} \to \{0,1\}^m$. The fact that $G_m$ is pseudorandom
is proven by a reduction as follows. Assuming for contradiction that we
have an algorithm T that runs in time t and distinguishes $G_m$ from
uniform with advantage $\varepsilon$, we construct an algorithm $T'$ running in time
$t' = t \cdot (m/\varepsilon)^{O(1)}$ inverting $f_\ell$ (say with probability 1/2). If $t' \le \mathrm{poly}(\ell)$,
then this contradicts the one-wayness of f, and hence we conclude
that T cannot exist and $G_m$ is a $(t,\varepsilon)$ pseudorandom generator.

Quantitatively, if $f_\ell$ is hard to invert by algorithms running in time
$s(\ell)$ and we take $m = 1/\varepsilon = s(\ell)^{o(1)}$, then we have $t' \le s(\ell)$ for every
$t = \mathrm{poly}(m)$ and sufficiently large $\ell$. Thus, viewing the seed length d
of $G_m$ as a function of m, we have $d(m) = \mathrm{poly}(\ell) = \mathrm{poly}(s^{-1}(m^{\omega(1)}))$,
where $m^{\omega(1)}$ denotes any superpolynomial function of m.

Thus: 


• If $s(\ell) = \ell^{\omega(1)}$, we can get seed length $d(m) = m^{\varepsilon}$ for any
desired constant $\varepsilon > 0$ and $\mathrm{BPP} \subseteq \mathrm{SUBEXP}$ (as discussed
above).
• If $s(\ell) = 2^{\ell^{\Omega(1)}}$ (as is plausible for the factoring one-way
function), then we get seed length $d(m) = \mathrm{poly}(\log m)$ and
$\mathrm{BPP} \subseteq \tilde{\mathrm{P}}$.
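
To see where the polylogarithmic seed length in the second case comes from, here is a worked instance under one concrete (hypothetical) choice of parameters: a constant $\gamma \in (0,1]$ with $s(\ell) = 2^{\ell^{\gamma}}$, and output length $m = 2^{\ell^{\gamma/2}}$.

```latex
\begin{align*}
s(\ell) = 2^{\ell^{\gamma}}, \quad m = 1/\varepsilon = 2^{\ell^{\gamma/2}}
  &\;\Longrightarrow\; m = s(\ell)^{\ell^{-\gamma/2}} = s(\ell)^{o(1)}, \\
\ell = (\log_2 m)^{2/\gamma}
  &\;\Longrightarrow\; d(m) = \mathrm{poly}(\ell) = \mathrm{poly}(\log m), \\
2^{d(n^c)} = 2^{\mathrm{poly}(\log n)}
  &\;\Longrightarrow\; \mathrm{BPP} \subseteq \bigcup_{c} \mathrm{DTIME}\bigl(2^{\log^c n}\bigr).
\end{align*}
```

The last line applies Theorem 7.5; the resulting class is quasipolynomial time.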


But we cannot get seed length $d(m) = O(\log m)$, as needed for
concluding BPP = P, from this result. Even for the maximum possible
hardness $s(\ell) = 2^{\Omega(\ell)}$, we get $d(m) = \mathrm{poly}(\log m)$. In fact, Problem 7.3
shows that it is impossible to have a cryptographic PRG with seed
length $O(\log m)$ meeting Definition 7.9, where we require that $G_m$
be pseudorandom against all poly(m)-time algorithms. However, for
derandomization we only need $G_m$ to be pseudorandom against a fixed
poly-time algorithm, e.g., running in time t = m, and we would get such
generators with seed length $O(\log m)$ if the aforementioned construction
could be improved to yield seed length $d = O(\ell)$ instead of $d = \mathrm{poly}(\ell)$.


Open Problem 7.13. Given a one-way function $f : \{0,1\}^\ell \to \{0,1\}^\ell$
that is hard to invert by algorithms running in time $s = s(\ell)$ and a
constant c, is it possible to construct a fully explicit $(t,\varepsilon)$ pseudorandom
generator $G : \{0,1\}^d \to \{0,1\}^m$ with seed length $d = O(\ell)$ and
pseudorandomness against time $t = s \cdot (\varepsilon/m)^{O(1)}$?


The best known seed length for such a generator is
$d = O(\ell^3 \cdot \log(m/\varepsilon)/\log^2 s)$, which is $O(\ell^2)$ for the case that $s = 2^{\Omega(\ell)}$
and $m = 2^{\Omega(\ell)}$ as discussed above.

The above open problem has long been solved in the positive 
for one-way permutations $f : \{0,1\}^\ell \to \{0,1\}^\ell$. In fact, the construction
of pseudorandom generators from one-way permutations has a
particularly simple description: 


$$G_m(x,r) = \left(\langle x,r\rangle, \langle f(x),r\rangle, \langle f(f(x)),r\rangle, \ldots, \langle f^{(m-1)}(x),r\rangle\right),$$


where $|r| = |x| = \ell$ and $\langle\cdot,\cdot\rangle$ denotes inner product modulo 2. One
intuition for this construction is the following. Consider the sequence
$(f^{(m-1)}(U_\ell), f^{(m-2)}(U_\ell), \ldots, f(U_\ell), U_\ell)$. By the fact that f is hard to
invert (but easy to evaluate) it can be argued that the (i+1)st component
of this sequence is infeasible to predict from the first i components
except with negligible probability. Thus, it is a computational analogue
of a block source. The pseudorandom generator then is obtained by
a computational analogue of block-source extraction, using the strong
extractor $\mathrm{Ext}(x,r) = \langle x,r\rangle$. The fact that the extraction works in this
computational setting, however, is much more delicate and complex 
to prove than in the setting of extractors, and relies on a “local 
list-decoding algorithm” for the corresponding code (namely the 
Hadamard code). See Problems 7.12 and 7.13. (We will discuss local 
list decoding in Section 7.6.) 
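
The shape of this construction (iterate f, output one inner-product bit per iteration) can be sketched as follows. This is a structural illustration only: `toy_permutation` is a hypothetical stand-in of our own that is a permutation of $\{0,1\}^\ell$ but is certainly not one-way, so the output here has no pseudorandomness guarantee.

```python
def inner_product_mod2(x, r):
    """<x, r> mod 2, with the integers x and r viewed as bit vectors."""
    return bin(x & r).count("1") % 2


def toy_permutation(x, ell):
    """Hypothetical permutation of {0,1}^ell (odd multiplier mod 2^ell); NOT one-way."""
    return (5 * x + 3) % (2**ell)


def generator(x, r, m, ell, f=toy_permutation):
    """G_m(x, r) = (<x,r>, <f(x),r>, <f(f(x)),r>, ..., <f^(m-1)(x),r>)."""
    out = []
    for _ in range(m):
        out.append(inner_product_mod2(x, r))
        x = f(x, ell)  # advance to the next iterate of f
    return out


ell, m = 8, 20
x0, r0 = 0b10110010, 0b01100110
bits = generator(x0, r0, m, ell)
```

Each output bit costs one evaluation of f and one inner product, so the generator runs in time polynomial in m and $\ell$ whenever f does.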


Pseudorandom Functions. It turns out that a cryptographic
pseudorandom generator can be used to build an even more powerful
object: a family of pseudorandom functions. This is a family of
functions $\{f_s : \{0,1\}^d \to \{0,1\}\}_{s \in \{0,1\}^d}$ such that (a) given the seed s, the
function $f_s$ can be evaluated in polynomial time, but (b) without the
seed, it is infeasible to distinguish an oracle for $f_s$ from an oracle to a
truly random function. Thus in some sense, the d-bit truly random seed
s is stretched to $2^d$ pseudorandom bits (namely the truth table of $f_s$)!
Pseudorandom functions have applications in several domains:


• Cryptography: When two parties share a seed s to a PRF,
they effectively share a random function $f : \{0,1\}^d \to \{0,1\}$.
(By definition, the function they share is indistinguishable
from random by any poly-time third party.) Thus, in order
for one party to send a message m encrypted to the other,
they can simply choose a random $r \stackrel{R}{\leftarrow} \{0,1\}^d$, and send
$(r, f_s(r) \oplus m)$. With knowledge of s, decryption is easy;
simply calculate $f_s(r)$ and XOR it to the second part of the
received message. However, the value $f_s(r) \oplus m$ would look
essentially random to anyone without knowledge of s.

This is just one example; pseudorandom functions have vast
applicability in cryptography.

• Learning Theory: Here, PRFs are used mainly to prove negative
results. The basic paradigm in computational learning
theory is that we are given a list of examples of a function’s
behavior, $(x_1, f(x_1)), (x_2, f(x_2)), \ldots, (x_k, f(x_k))$, where the
$x_i$'s are being selected randomly from some underlying
distribution, and we would like to predict what the function’s
value will be on a new data point $x_{k+1}$ coming from
the same distribution. Information-theoretically, correct
prediction is possible after a small number of samples (with
high probability), assuming that the function has a small
description (e.g., is computable by a poly-sized circuit).
However, it is computationally hard to predict the output
of a PRF $f_s$ on a new point $x_{k+1}$ after seeing its value on
k points (and this holds even if the algorithm gets to make
“membership queries,” that is, to choose the evaluation points on
its own, in addition to getting random examples from some
underlying distribution). Thus, PRFs provide examples of
functions that are efficiently computable yet hard to learn.

• Hardness of Proving Circuit Lower Bounds: One main
approach to proving $\mathrm{P} \ne \mathrm{NP}$ is to show that some
$f \in \mathrm{NP}$ doesn’t have polynomial size circuits (equivalently,
$\mathrm{NP} \not\subseteq \mathrm{P/poly}$). This approach has had very limited
success: the only superpolynomial lower bounds that have
been achieved have been using very restricted classes of
circuits (monotone circuits, constant depth circuits, etc).
For general circuits, the best lower bound that has been
achieved for a problem in NP is $5n - o(n)$.

Pseudorandom functions have been used to help explain
why existing lower-bound techniques have so far not yielded
superpolynomial circuit lower bounds. Specifically, it has
been shown that any sufficiently “constructive” proof of
superpolynomial circuit lower bounds (one that would allow
us to certify that a randomly chosen function has no small
circuits) could be used to distinguish a pseudorandom function
from truly random in subexponential time and thus invert
any one-way function in subexponential time. This is known
as the “Natural Proofs” barrier to circuit lower bounds.
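
The one-bit encryption scheme from the cryptography item above can be sketched in Python. As an assumption made purely for illustration, we use one bit of HMAC-SHA256 as a heuristic stand-in for the pseudorandom function $f_s$; the text does not specify any concrete PRF, the function names are ours, and this sketch is not a vetted encryption scheme.

```python
import hashlib
import hmac
import secrets


def prf_bit(seed: bytes, r: bytes) -> int:
    """Heuristic stand-in for f_s(r): the lowest bit of the first HMAC-SHA256 byte."""
    digest = hmac.new(seed, r, hashlib.sha256).digest()
    return digest[0] & 1


def encrypt(seed: bytes, message_bit: int, nonce_len: int = 16):
    """Choose a fresh random r and send (r, f_s(r) XOR m)."""
    r = secrets.token_bytes(nonce_len)
    return r, prf_bit(seed, r) ^ message_bit


def decrypt(seed: bytes, ciphertext) -> int:
    """Recompute f_s(r) from the shared seed and XOR it off."""
    r, masked = ciphertext
    return prf_bit(seed, r) ^ masked


shared_seed = secrets.token_bytes(16)
roundtrip = [decrypt(shared_seed, encrypt(shared_seed, bit)) for bit in (0, 1, 1, 0)]
```

Decryption needs only the seed and the transmitted r, exactly as in the description above; encrypting a longer message would repeat the scheme bit by bit with a fresh r each time.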


7.3 Hybrid Arguments 


In this section, we introduce a very useful proof method for working 
with computational indistinguishability, known as the hybrid argument.
We use it to establish two important facts: that computational
indistinguishability is preserved under taking multiple samples, and 
that pseudorandomness is equivalent to next-bit unpredictability. 


7.3.1 Indistinguishability of Multiple Samples 


The following proposition illustrates that computational
indistinguishability behaves like statistical difference when taking many
independent repetitions; the distance $\varepsilon$ multiplies by at most the
number of copies (cf. Lemma 6.3, Part 6). Proving it will introduce useful
techniques for reasoning about computational indistinguishability, and 
will also illustrate how working with such computational notions can 
be more subtle than working with statistical notions. 


Proposition 7.14. If random variables X and Y are $(t,\varepsilon)$
indistinguishable, then for every $k \in \mathbb{N}$, $X^k$ and $Y^k$ are $(t, k\varepsilon)$ indistinguishable
(where $X^k$ represents k independent copies of X).


Note that when $t = \infty$, this follows from Lemma 6.3, Part 6; the
challenge here is to show that the same holds even when we restrict to 
computationally bounded distinguishers. 


Proof. We will prove the contrapositive: if there is an efficient
algorithm T distinguishing $X^k$ and $Y^k$ with advantage greater than
$k\varepsilon$, then there is an efficient algorithm $T'$ distinguishing X and Y
with advantage greater than $\varepsilon$. The difference in this proof from the
corresponding result about statistical difference is that we need to
preserve efficiency when going from T to $T'$. The algorithm $T'$ will
naturally use the algorithm T as a subroutine. Thus this is a reduction
in the same spirit as reductions used elsewhere in complexity theory
(e.g., in the theory of NP-completeness).

Suppose that there exists a nonuniform time t algorithm T such that

$$\left| \Pr[T(X^k) = 1] - \Pr[T(Y^k) = 1] \right| > k\varepsilon. \tag{7.1}$$


We can drop the absolute value in the above expression without 
loss of generality. (Otherwise we can replace T with its negation; recall 
that negations are free in our measure of circuit size.)

Now we will use a “hybrid argument.” Consider the hybrid
distributions $H_i = X^{k-i} Y^i$ for $i = 0, \ldots, k$. Note that $H_0 = X^k$ and
$H_k = Y^k$. Then Inequality (7.1) is equivalent to

$$\sum_{i=1}^{k} \left( \Pr[T(H_{i-1}) = 1] - \Pr[T(H_i) = 1] \right) > k\varepsilon,$$

since the sum telescopes. Thus, there must exist some $i \in [k]$ such that
$\Pr[T(H_{i-1}) = 1] - \Pr[T(H_i) = 1] > \varepsilon$, i.e.,

$$\Pr[T(X^{k-i} X Y^{i-1}) = 1] - \Pr[T(X^{k-i} Y Y^{i-1}) = 1] > \varepsilon.$$

By averaging, there exist some $x_1, \ldots, x_{k-i}$ and $y_{k-i+2}, \ldots, y_k$ such
that

$$\Pr[T(x_1, \ldots, x_{k-i}, X, y_{k-i+2}, \ldots, y_k) = 1] - \Pr[T(x_1, \ldots, x_{k-i}, Y, y_{k-i+2}, \ldots, y_k) = 1] > \varepsilon.$$

Then, define $T'(z) = T(x_1, \ldots, x_{k-i}, z, y_{k-i+2}, \ldots, y_k)$. Note that $T'$
is a nonuniform algorithm with advice $i, x_1, \ldots, x_{k-i}, y_{k-i+2}, \ldots, y_k$
hardwired in. Hardwiring these inputs costs nothing in terms of circuit
size. Thus $T'$ is a nonuniform time t algorithm such that

$$\Pr[T'(X) = 1] - \Pr[T'(Y) = 1] > \varepsilon,$$

contradicting the indistinguishability of X and Y. □
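
The telescoping step of the hybrid argument can be checked exactly on a toy example. In the sketch below (our own illustration; the distributions and the majority test are hypothetical choices), X is a biased bit, Y is a uniform bit, and T is the majority test on k = 3 samples.

```python
from fractions import Fraction
from itertools import product


def accept_probability(test, dists):
    """Exact Pr[test(z_1, ..., z_k) = 1] with z_j drawn independently from dists[j]."""
    total = Fraction(0)
    for outcome in product(*(d.items() for d in dists)):
        values = tuple(v for v, _ in outcome)
        weight = Fraction(1)
        for _, p in outcome:
            weight *= p
        if test(values):
            total += weight
    return total


def hybrid_gaps(test, x_dist, y_dist, k):
    """Gaps Pr[T(H_{i-1}) = 1] - Pr[T(H_i) = 1] for i = 1..k, where H_i = X^{k-i} Y^i."""
    probs = [accept_probability(test, [x_dist] * (k - i) + [y_dist] * i)
             for i in range(k + 1)]
    return [probs[i - 1] - probs[i] for i in range(1, k + 1)]


k = 3
x_dist = {1: Fraction(3, 4), 0: Fraction(1, 4)}   # biased bit
y_dist = {1: Fraction(1, 2), 0: Fraction(1, 2)}   # uniform bit


def majority(z):
    return sum(z) >= 2


gaps = hybrid_gaps(majority, x_dist, y_dist, k)
total_gap = (accept_probability(majority, [x_dist] * k)
             - accept_probability(majority, [y_dist] * k))
```

The gaps sum to the total advantage of T on $X^k$ versus $Y^k$, so at least one adjacent pair of hybrids is separated by at least a 1/k fraction of it; that index i is the one the proof hardwires into $T'$.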


While the parameters in the above result behave nicely, with $(t,\varepsilon)$
going to $(t,k\varepsilon)$, there are some implicit costs. First, the amount of
nonuniform advice used by $T'$ is larger than that used by T. This is
hidden by the fact that we are using the same measure t (namely circuit
size) to bound both the time and the advice length. Second, the result is
meaningless for large values of k (e.g., k = t), because a time t algorithm
cannot read more than t bits of the input distributions $X^k$ and $Y^k$.

We note that there is an analogue of the above result for
computational indistinguishability against uniform algorithms (Definition 7.2),
but it is more delicate, because we cannot simply hardwire
$i, x_1, \ldots, x_{k-i}, y_{k-i+2}, \ldots, y_k$ as advice. Indeed, a direct analogue of
the proposition as stated is known to be false. We need to add the
additional condition that the distributions X and Y are efficiently
samplable. Then $T'$ can choose $i \in [k]$ at random, and randomly
sample $x_1, \ldots, x_{k-i} \stackrel{R}{\leftarrow} X$, $y_{k-i+2}, \ldots, y_k \stackrel{R}{\leftarrow} Y$.


7.3.2 Next-Bit Unpredictability 


In analyzing the pseudorandom generators that we construct, it will 
be useful to work with a reformulation of the pseudorandomness 
property, which says that given a prefix of the output, it should be 
hard to predict the next bit much better than random guessing. 

For notational convenience, we deviate from our usual conventions 
and write $X_i$ to denote the ith bit of random variable X, rather than
the ith random variable in some ensemble. We have: 


Definition 7.15. Let X be a random variable distributed on $\{0,1\}^m$.
For $t \in \mathbb{N}$ and $\varepsilon \in [0,1]$, we say that X is $(t,\varepsilon)$ next-bit unpredictable if
for every nonuniform probabilistic algorithm P running in time t and
every $i \in [m]$, we have:

$$\Pr[P(X_1 X_2 \cdots X_{i-1}) = X_i] \le \frac{1}{2} + \varepsilon,$$

where the probability is taken over X and the coin tosses of P.


Note that the uniform distribution $X = U_m$ is (t,0) next-bit
unpredictable for every t. Intuitively, if X is pseudorandom, it must 
be next-bit unpredictable, as this is just one specific kind of test one 
can perform on X. In fact the converse also holds, and that will be 
the direction we use. 


Proposition 7.16. Let X be a random variable taking values in
$\{0,1\}^m$. If X is $(t,\varepsilon)$ pseudorandom, then X is $(t - O(1),\varepsilon)$ next-bit
unpredictable. Conversely, if X is $(t,\varepsilon)$ next-bit unpredictable, then it
is $(t, m \cdot \varepsilon)$ pseudorandom.


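Proof. Here U denotes an r.v. uniformly distributed on $\{0,1\}^m$ and
$U_i$ denotes the ith bit of U.

pseudorandom $\Rightarrow$ next-bit unpredictable. The proof is by
reduction. Suppose for contradiction that X is not $(t - 3,\varepsilon)$ next-bit
unpredictable, so we have a predictor $P : \{0,1\}^{i-1} \to \{0,1\}$ that
succeeds with probability greater than $1/2 + \varepsilon$. We construct an algorithm
$T : \{0,1\}^m \to \{0,1\}$ that distinguishes X from $U_m$ as follows:

$$T(x_1 x_2 \cdots x_m) = \begin{cases} 1 & \text{if } P(x_1 x_2 \cdots x_{i-1}) = x_i \\ 0 & \text{otherwise.} \end{cases}$$

T can be implemented with the same number of $\wedge$ and $\vee$ gates as P,
plus 3 for testing equality (via the formula $(x \wedge y) \vee (\neg x \wedge \neg y)$).
Then $\Pr[T(X) = 1] > 1/2 + \varepsilon$, whereas $\Pr[T(U_m) = 1] = 1/2$ because $U_i$ is
independent of $U_1 \cdots U_{i-1}$. So T has size at most t and advantage
greater than $\varepsilon$, contradicting the pseudorandomness of X.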


next-bit unpredictable = pseudorandom. Also by reduction. 
Suppose X is not pseudorandom, so we have a nonuniform algorithm 
T running in time t s.t. 


Pr[T(X) = 1] — Pr[T(U) = 1] > e, 


where we have dropped the absolute values without loss of generality 
as in the proof of Proposition 7.14. 
We now use a hybrid argument. Define H;= 
X 1X2 “ye * XiUi+1Ui+2 oe U Then Hm = X and Ho = U. We have: 
m 
5 Pera = E =a 
i=1 


since the sum telescopes. Thus, there must exist an 7 such that 
Pri T(H;) = 1] — Pr[T(Ai_1) = 1] > e/m. 


This says that T is more likely to output 1 when we put X; in the 
ith bit than when we put a random bit U;. We can view U; as being 
X; with probability 1/2 and being X; with probability 1/2. The only 
advantage T has must be coming from the latter case, because in the 
former case, the two distributions are identical. Formally, 


e/m < Pr[T(H;) = 1] — Pr{T(Hi-1) = 1] 
— Pr[T(X1 mae Xj-1XUi41 meee Um) = 1] 
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= G i Pr[T(Xı oy -Xi—-1XiUi+1 A Um) z 1] 
1 
2 


. Pr[T(X1 tee Xj 1 XUia1 PS Um) = i!) 


1 
= 2 (Pr[T (X41 ee Xi—-1XiUi+1 at Una) r: 1] 
— Pr[T(Xı cio Xj X Uist one Ui) = 1]). 
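This motivates the following next-bit predictor: $P(x_1 x_2 \cdots x_{i-1})$:
(1) Choose random bits $u_i,\ldots,u_m \in \{0,1\}$ uniformly and independently.
(2) Compute $b = T(x_1 \cdots x_{i-1} u_i \cdots u_m)$.
(3) If b = 1, output $u_i$, otherwise output $\overline{u_i}$.


The intuition is that T is more likely to output 1 when $u_i = x_i$ than
when $u_i = \overline{x_i}$. Formally, we have:

\begin{align*}
\Pr[P(X_1 \cdots X_{i-1}) = X_i]
&= \tfrac{1}{2}\cdot\big(\Pr[T(X_1 \cdots X_{i-1} U_i U_{i+1} \cdots U_m) = 1 \mid U_i = X_i] \\
&\qquad + \Pr[T(X_1 \cdots X_{i-1} U_i U_{i+1} \cdots U_m) = 0 \mid U_i \neq X_i]\big) \\
&= \tfrac{1}{2}\cdot\big(\Pr[T(X_1 \cdots X_{i-1} X_i U_{i+1} \cdots U_m) = 1] \\
&\qquad + 1 - \Pr[T(X_1 \cdots X_{i-1} \overline{X_i} U_{i+1} \cdots U_m) = 1]\big) \\
&> \tfrac{1}{2} + \tfrac{\varepsilon}{m}.
\end{align*}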


Note that as described P runs in time t + O(m). Recalling that we are
using circuit size as our measure of nonuniform time, we can reduce the
running time to t as follows. First, we may nonuniformly fix the coin
tosses $u_i,\ldots,u_m$ of P while preserving its advantage. Then all P does
is run T on $x_1 \cdots x_{i-1}$ concatenated with some fixed bits and either
output what T does or its negation (depending on the fixed value of $u_i$).
Fixing some input bits and negation can be done without increasing
circuit size. Thus we contradict the next-bit unpredictability of X.
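
The calculation above can be checked numerically. The following Python sketch (an illustration under assumed toy parameters, not part of the original text) computes the hybrid gaps $\Pr[T(H_i)=1]-\Pr[T(H_{i-1})=1]$ and the exact success probability of the predictor P by enumerating all coin tosses. The toy source, the test `T`, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def pr_accept_hybrid(T, source, i, m):
    """Pr[T(H_i) = 1], where H_i = X_1..X_i U_{i+1}..U_m and X is uniform over `source`."""
    tails = list(product([0, 1], repeat=m - i))
    hits = sum(T(x[:i] + u) for x in source for u in tails)
    return Fraction(hits, len(source) * len(tails))

def predictor_success(T, source, i, m):
    """Pr[P(X_1..X_{i-1}) = X_i] for the predictor P built from T, over X and P's coins."""
    coins = list(product([0, 1], repeat=m - i + 1))   # u_i, ..., u_m
    wins = 0
    for x in source:
        for u in coins:
            b = T(x[:i - 1] + u)
            guess = u[0] if b == 1 else 1 - u[0]      # output u_i or its complement
            wins += int(guess == x[i - 1])
    return Fraction(wins, len(source) * len(coins))

m = 4
# Toy source (an assumption): three uniform bits followed by their parity.
source = [x + (sum(x) % 2,) for x in product([0, 1], repeat=m - 1)]
# Toy distinguisher: accept iff the last bit is the parity of the others.
T = lambda z: int(z[-1] == sum(z[:-1]) % 2)

for i in range(1, m + 1):
    gap = pr_accept_hybrid(T, source, i, m) - pr_accept_hybrid(T, source, i - 1, m)
    print(i, gap, predictor_success(T, source, i, m))
```

For every i, the success probability of P equals exactly 1/2 plus the ith hybrid gap, which is the identity underlying the displayed derivation; in this toy example the whole advantage is concentrated at i = m.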


We note that an analogue of this result holds for uniform distin-
guishers and predictors, provided that we change the definition of
next-bit predictor to involve a random choice of $i \in [m]$ instead of
a fixed value of i, and change the time bounds in the conclusions to
be t − O(m) rather than t − O(1) and t. (We can’t do tricks like in
the final paragraph of the proof.) In contrast to the multiple-sample
indistinguishability result of Proposition 7.14, this result does not
need X to be efficiently samplable for the uniform version.


7.4 Pseudorandom Generators from Average-Case Hardness 


In Section 7.2, we surveyed cryptographic pseudorandom generators, 
which have numerous applications within and outside cryptography, 
including to derandomizing BPP. However, for derandomization, we 
can use generators with weaker properties. Specifically, Theorem 7.5
only requires $G : \{0,1\}^{d(m)} \to \{0,1\}^m$ such that:


(1) G fools (nonuniform) distinguishers running in time m (as
opposed to all poly(m)-time distinguishers).

(2) G is computable in time $\mathrm{poly}(m, 2^{d(m)})$ (i.e., G is mildly
explicit). In particular, the PRG may take more time than
the distinguishers it is trying to fool.


Such a generator implies that every BPP algorithm can be deran-
domized in time $\mathrm{poly}(n) \cdot 2^{d(\mathrm{poly}(n))}$.

The benefit of studying such generators is that we can hope to con-
struct them under weaker assumptions than used for cryptographic gen-
erators. In particular, a generator with the properties above no longer
seems to imply P ≠ NP, much less the existence of one-way functions.
(The nondeterministic distinguisher that tests whether a string is 
an output of the generator by guessing a seed needs to evaluate the 
generator, which takes more time than the distinguishers are allowed.) 

However, as shown in Problem 7.1, such generators still imply 
nonuniform circuit lower bounds for exponential time, something that 
is beyond the state of the art in complexity theory. 

Our goal in the rest of this section is to construct generators as 
above from assumptions that are as weak as possible. In this section, we 
will construct them from boolean functions computable in exponential 
time that are hard on average (for nonuniform algorithms), and in the 
section after we will relax this to only require worst-case hardness.


7.4.1 Average-Case Hardness


A function is hard on average if it is hard to compute correctly on 
randomly chosen inputs. Formally: 


Definition 7.17. For $s \in \mathbb{N}$ and $\delta \in [0,1]$, we say that a Boolean func-
tion $f : \{0,1\}^\ell \to \{0,1\}$ is $(s,\delta)$ average-case hard if for all nonuniform
probabilistic algorithms A running in time s,

\[
\Pr[A(X) = f(X)] \le 1 - \delta,
\]


where the probability is taken over X and the coin tosses of A. 


Note that saying that f is $(s,\delta)$ hard for some $\delta > 0$ (possibly
exponentially small) amounts to saying that f is worst-case hard.¹
Thus, we think of average-case hardness as corresponding to values
of $\delta$ that are noticeably larger than zero, e.g., $\delta = 1/s$ or $\delta = 1/3$.
Indeed, in this section we will take $\delta = 1/2 - \varepsilon$ for $\varepsilon = 1/s$. That is, no
efficient algorithm can compute f much better than random guessing.
A typical setting of parameters we use is $s = s(\ell)$ somewhere in range
from $\ell^{\omega(1)}$ (slightly superpolynomial) to $s(\ell) = 2^{\alpha\ell}$ for a constant
$\alpha > 0$. (Note that every function is computable by a nonuniform
algorithm running in time roughly $2^\ell$, so we cannot take $s(\ell)$ to be
any larger.) We will also require f to be computable in (uniform) time
$2^{O(\ell)}$ so that our pseudorandom generator will be computable in time
exponential in its seed length. The existence of such an average-case 
hard function may seem like a strong assumption, but in later sections 
we will see how to deduce it from a worst-case hardness assumption. 

Now we show how to obtain a pseudorandom generator from 
average-case hardness. 


Proposition 7.18. If $f : \{0,1\}^\ell \to \{0,1\}$ is $(t, 1/2 - \varepsilon)$ average-case
hard, then $G(x) = x \circ f(x)$ is a $(t,\varepsilon)$ pseudorandom generator.


¹ For probabilistic algorithms, the “right” definition of worst-case hardness is actually that
there exists an input x for which $\Pr[A(x) = f(x)] < 2/3$, where the probability is taken
over the coin tosses of A. But for nonuniform algorithms the two definitions can be shown to
be roughly equivalent. See Definition 7.34 and the subsequent discussion.

We omit the proof of this proposition, but it follows from Prob-
lem 7.5, Part 2 (by setting m = 1, a = 0, and $d = \ell$ in Theorem 7.24).
Note that this generator includes its seed in its output. This is 
impossible for cryptographic pseudorandom generators, but is feasible 
(as shown above) when the generator can have more resources than 
the distinguishers it is trying to fool. 

Of course, this generator is quite weak, stretching by only one bit. 
We would like to get many bits out. Here are two attempts: 


• Use concatenation: Define $G'(x_1 \cdots x_k) = x_1 \cdots x_k\, f(x_1) \cdots
f(x_k)$. This is a $(t, k\varepsilon)$ pseudorandom generator because
$G'(U_{k\ell})$ consists of k independent samples of a pseudorandom
distribution and thus computational indistinguishability is
preserved by Proposition 7.14. Note that already here we
are relying on nonuniform indistinguishability, because the
distribution $(U_\ell, f(U_\ell))$ is not necessarily samplable (in time
that is feasible for the distinguishers). Unfortunately, how-
ever, this construction does not improve the ratio between
output length and seed length, which remains very close to 1.

• Use composition: For example, try to get two
bits out using the same seed length by defining
$G'(x) = G(G(x)_{1\cdots\ell})\,G(x)_{\ell+1}$, where $G(x)_{1\cdots\ell}$ denotes
the first $\ell$ bits of G(x). This works for cryptographic
pseudorandom generators, but not for the generators we are
considering here. Indeed, for the generator $G(x) = x f(x)$ of
Proposition 7.18, we would get $G'(x) = x f(x) f(x)$, which is
clearly not pseudorandom.


7.4.2 The Pseudorandom Generator 


Our goal now is to show the following: 


Theorem 7.19. For $s : \mathbb{N} \to \mathbb{N}$, suppose that there is a function
$f \in \mathrm{E} = \mathrm{DTIME}(2^{O(\ell)})$² such that for every input length $\ell \in \mathbb{N}$, f
is $(s(\ell), 1/2 - 1/s(\ell))$ average-case hard, where $s(\ell)$ is computable in
time $2^{O(\ell)}$. Then for every $m \in \mathbb{N}$, there is a mildly explicit $(m, 1/m)$
pseudorandom generator $G : \{0,1\}^{d(m)} \to \{0,1\}^m$ with seed length
$d(m) = O(s^{-1}(\mathrm{poly}(m))^2/\log m)$ that is computable in time $2^{O(d(m))}$.


² E should be contrasted with the larger class $\mathrm{EXP} = \mathrm{DTIME}(2^{\mathrm{poly}(\ell)})$. See Problem 7.2.


Note that this is similar to the seed length $d(m) =
\mathrm{poly}(s^{-1}(\mathrm{poly}(m)))$ mentioned in Section 7.2 for constructing
cryptographic pseudorandom generators from one-way functions, but
the average-case assumption is incomparable (and will be weakened
further in the next section). In fact, it is known how to achieve a seed
length $d(m) = O(s^{-1}(\mathrm{poly}(m)))$, which matches what is known for
constructing pseudorandom generators from one-way permutations as
well as the converse implication of Problem 7.1. We will not cover that
improvement here (see the Chapter Notes and References for pointers),
but note that for the important case of hardness $s(\ell) = 2^{\Omega(\ell)}$, Theo-
rem 7.19 achieves seed length $d(m) = O(O(\log m)^2/\log m) = O(\log m)$
and thus P = BPP. More generally, we have:


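Corollary 7.20. Suppose that E has a $(s(\ell), 1/2 - 1/s(\ell))$ average-
case hard function $f : \{0,1\}^\ell \to \{0,1\}$.


(1) If $s(\ell) = 2^{\Omega(\ell)}$, then BPP = P.
(2) If $s(\ell) = 2^{\ell^{\Omega(1)}}$, then $\mathrm{BPP} \subset \tilde{\mathrm{P}}$.
(3) If $s(\ell) = \ell^{\omega(1)}$, then $\mathrm{BPP} \subset \mathrm{SUBEXP}$.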


The idea is to apply f repeatedly, but on slightly dependent inputs, 
namely ones that share very few bits. The sets of seed bits used for 
each output bit will be given by a design: 


Definition 7.21. $S_1,\ldots,S_m \subset [d]$ is an $(\ell,a)$-design if


(1) $\forall i$, $|S_i| = \ell$
(2) $\forall i \neq j$, $|S_i \cap S_j| \le a$

We want lots of sets having small intersections over a small
universe. We will use the designs established by Problem 3.2:


Lemma 7.22. For every constant $\gamma > 0$ and every $\ell, m \in \mathbb{N}$, there
exists an $(\ell,a)$-design $S_1,\ldots,S_m \subset [d]$ with $d = O(\ell^2/a)$ and $a = \gamma \cdot \log m$.
Such a design can be constructed deterministically in time poly(m,d).


The important points are that intersection sizes are only logarith-
mic in the number of sets, and the universe size d is at most quadratic
in the set size $\ell$ (and can be linear in $\ell$ in case we take $m = 2^{\Omega(\ell)}$).
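
To make the notion of a design concrete, here is a small Python sketch. It is not the construction of Problem 3.2; it assumes instead the well-known polynomial-based construction, in which the universe is $\mathbb{F}_q \times \mathbb{F}_q$ for a prime q and each set is the graph of a low-degree polynomial. This gives $\ell = q$ and $d = \ell^2$, slightly weaker than the $d = O(\ell^2/a)$ of Lemma 7.22. The function names `polynomial_design` and `is_design` are hypothetical.

```python
from itertools import combinations, product

def polynomial_design(q, deg_bound):
    """Sets S_p = {(a, p(a)) : a in F_q}, one per polynomial p of degree < deg_bound.

    Assumes q is prime. The universe F_q x F_q is flattened to range(q * q).
    Two distinct polynomials of degree < deg_bound agree on at most
    deg_bound - 1 points, which bounds the pairwise intersections.
    """
    sets = []
    for coeffs in product(range(q), repeat=deg_bound):
        S = frozenset(
            a * q + sum(c * pow(a, k, q) for k, c in enumerate(coeffs)) % q
            for a in range(q)
        )
        sets.append(S)
    return sets

def is_design(sets, ell, a):
    """Check the two conditions of Definition 7.21."""
    return (all(len(S) == ell for S in sets)
            and all(len(S & T) <= a for S, T in combinations(sets, 2)))

sets = polynomial_design(q=5, deg_bound=2)   # 25 sets of size 5 over a universe of size 25
print(len(sets), is_design(sets, ell=5, a=1))  # 25 True
```

With `deg_bound = 3` the same universe of size 25 supports 125 sets with pairwise intersections of size at most 2, illustrating how the number of sets can grow much faster than the universe.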


Construction 7.23. (Nisan–Wigderson Generator) Given a
function $f : \{0,1\}^\ell \to \{0,1\}$ and an $(\ell,a)$-design $S_1,\ldots,S_m \subset [d]$, define
the Nisan–Wigderson generator $G : \{0,1\}^d \to \{0,1\}^m$ as

\[
G(x) = f(x|_{S_1})\, f(x|_{S_2}) \cdots f(x|_{S_m}),
\]

where if x is a string in $\{0,1\}^d$ and $S \subset [d]$, then $x|_S$ is the string of
length |S| obtained from x by selecting the bits indexed by S.
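
The construction translates almost directly into code. The sketch below is only an illustration of the mechanics: the design (all 3-subsets of a 5-element universe, a $(3,2)$-design) and the function f (parity, which is certainly not average-case hard for general circuits) are stand-ins chosen as assumptions, and the names `restrict` and `nw_generator` are hypothetical.

```python
from itertools import combinations

def restrict(x, S):
    """x|_S: the bits of x indexed by S, taken in increasing index order."""
    return tuple(x[j] for j in sorted(S))

def nw_generator(f, design):
    """Return G with G(x) = f(x|_{S_1}) f(x|_{S_2}) ... f(x|_{S_m})."""
    def G(x):
        return tuple(f(restrict(x, S)) for S in design)
    return G

# Toy (3, 2)-design over [5]: all 3-element subsets (0-indexed), so d = 5, m = 10.
design = [frozenset(S) for S in combinations(range(5), 3)]
# Stand-in for the hard function f; parity is used only to make the example runnable.
parity = lambda y: sum(y) % 2

G = nw_generator(parity, design)
out = G((1, 0, 1, 1, 0))
print(len(out), out)
```

Each output bit applies f to a different window of the seed, and the design guarantees that any two windows share at most a positions; here a 5-bit seed yields 10 output bits.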


Theorem 7.24. Let $G : \{0,1\}^d \to \{0,1\}^m$ be the Nisan–Wigderson
generator based on a function $f : \{0,1\}^\ell \to \{0,1\}$ and some $(\ell,a)$-
design. If f is $(s, 1/2 - \varepsilon/m)$ average-case hard, then G is a $(t,\varepsilon)$
pseudorandom generator, for $t = s - m \cdot a \cdot 2^a$.


Theorem 7.19 follows from Theorem 7.24 by setting $\varepsilon = 1/m$
and $a = \log m$, and observing that for $\ell = s^{-1}(m^3)$, then
$t = s(\ell) - m \cdot a \cdot 2^a \ge m$, so we have an $(m, 1/m)$ pseudorandom gen-
erator. The seed length is $d = O(\ell^2/\log m) = O(s^{-1}(\mathrm{poly}(m))^2/\log m)$.
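
As a worked instance of this parameter setting, suppose (as an illustrative assumption) that $s(\ell) = 2^{\alpha\ell}$ for a constant $\alpha > 0$, with logarithms taken base 2 and $\gamma = 1$ in Lemma 7.22. Then the calculation reads:

```latex
\begin{align*}
\ell &= s^{-1}(m^3) = \tfrac{3}{\alpha}\log m, \qquad a = \log m, \qquad 2^a = m, \\
\tfrac{1}{s(\ell)} &= \tfrac{1}{m^3} \le \tfrac{1}{m^2} = \tfrac{\varepsilon}{m}
  \quad\text{(so $f$ is $(s, 1/2 - \varepsilon/m)$ average-case hard)}, \\
t &= s(\ell) - m \cdot a \cdot 2^a = m^3 - m^2 \log m \ge m, \\
d &= O\!\left(\ell^2/a\right)
   = O\!\left(\tfrac{9}{\alpha^2}\cdot\tfrac{\log^2 m}{\log m}\right) = O(\log m).
\end{align*}
```

The bound $t \ge m$ holds for every $m \ge 1$ because $\log m \le m - 1$ gives $m^2 \log m \le m^3 - m^2 \le m^3 - m$.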


Proof. Suppose G is not a $(t,\varepsilon)$ pseudorandom generator. By Proposi-
tion 7.16, there is a nonuniform time t next-bit predictor P such that

\[
\Pr\left[P\left(f(X|_{S_1})\, f(X|_{S_2}) \cdots f(X|_{S_{i-1}})\right) = f(X|_{S_i})\right] > \frac{1}{2} + \frac{\varepsilon}{m}, \tag{7.2}
\]

for some $i \in [m]$. From P, we construct A that computes f with
probability greater than $1/2 + \varepsilon/m$.

Let $Y = X|_{S_i}$. By averaging, we can fix all bits of $X|_{\overline{S_i}} = z$ (where
$\overline{S_i}$ is the complement of $S_i$) such that the prediction probability
remains greater than $1/2 + \varepsilon/m$ (over Y and the coin tosses of the
predictor P). Define $f_j(y) = f(x|_{S_j})$ for $j \in \{1,\ldots,i-1\}$. (That is,
$f_j(y)$ forms x by placing y in the positions in $S_i$ and z in the others,
and then applies f to $x|_{S_j}$.) Then

\[
\Pr_Y\left[P(f_1(Y) \cdots f_{i-1}(Y)) = f(Y)\right] > \frac{1}{2} + \frac{\varepsilon}{m}.
\]

Note that $f_j(y)$ depends on only $|S_i \cap S_j| \le a$ bits of y. Thus, we
can compute each $f_j$ with a look-up table, which we can include in the
advice to our nonuniform algorithm. Indeed, every function on a bits
can be computed by a boolean circuit of size at most $a \cdot 2^a$. (In fact,
size at most $O(2^a/a)$ suffices.)

Then, defining $A(y) = P(f_1(y) \cdots f_{i-1}(y))$, we deduce that A computes
f with error probability smaller than $1/2 - \varepsilon/m$ in nonuni-
form time less than $t + m \cdot a \cdot 2^a = s$. This contradicts the hardness
of f. Thus, we conclude G is a $(t,\varepsilon)$ pseudorandom generator.


Some additional remarks on this proof: 


(1) This is a very general construction that works for any 
average-case hard function f. We only used f € E to deduce 
G is computable in E. 

(2) The reduction works for any nonuniform class of algorithms 
C where functions of logarithmically many bits can be 
computed efficiently. 


Indeed, in the next section we will use the same construction to 
obtain an unconditional pseudorandom generator fooling constant- 
depth circuits, and will later exploit the above “black-box” properties 
even further. 

As mentioned earlier, the parameters of Theorem 7.24 have been
improved in subsequent work, but the newer constructions do not have
the clean structure of the Nisan–Wigderson generator, where the seed of
the generator is used to generate m random but correlated evaluation
points, on which the average-case hard function f is evaluated. Indeed,
each output bit of the improved generators depends on the entire
truth-table of the function f, translating to a construction of signifi-
cantly higher computational complexity. Thus the following remains an
interesting open problem (which would have significance for hardness
amplification as well as constructing pseudorandom generators):


Open Problem 7.25. For every $\ell, s \in \mathbb{N}$, construct an explicit
generator $H : \{0,1\}^{O(\ell)} \to (\{0,1\}^\ell)^m$ with $m = s^{\Omega(1)}$ such that if f
is $(s, 1/2 - 1/s)$ average-case hard and we define $G(x) = f(H_1(x))\,
f(H_2(x)) \cdots f(H_m(x))$, where $H_i(x)$ denotes the ith component of
H(x), then G is an $(m, 1/m)$ pseudorandom generator.


7.4.3  Derandomizing Constant-depth circuits 


Definition 7.26. An unbounded fan-in circuit $C(x_1,\ldots,x_n)$ has input
gates consisting of variables $x_i$, their negations $\neg x_i$, and the constants
0 and 1, as well as computation gates, which can compute the AND
or OR of an unbounded number of other gates (rather than just 2, as
in usual Boolean circuits).³ The size of such a circuit is the number of
computation gates, and the depth is the maximum length of a path
from an input gate to the output gate.

$\mathrm{AC}^0$ is the class of functions $f : \{0,1\}^* \to \{0,1\}$ for which there
exist constants c and k and a uniformly constructible sequence
of unbounded fan-in circuits $(C_n)_{n \in \mathbb{N}}$ such that for all n, $C_n$ has
size at most $n^c$ and depth at most k, and for all $x \in \{0,1\}^n$,
$C_n(x) = f(x)$. Uniform constructibility means that there is an efficient
(e.g., polynomial-time) uniform algorithm M such that for all n,
$M(1^n) = C_n$ (where $1^n$ denotes the number n in unary, i.e., a string of
n 1s). $\mathrm{BPAC}^0$ is defined analogously, except that $C_n$ may have poly(n)
extra inputs, which are interpreted as random bits, and we require
$\Pr_r[C_n(x,r) = f(x)] \ge 2/3$.


³ Note that it is unnecessary to allow internal NOT gates, as these can always be pushed
to the inputs via DeMorgan’s Laws at no increase in size or depth.

$\mathrm{AC}^0$ is one of the richest circuit classes for which we have
superpolynomial lower bounds:


Theorem 7.27. For all constant $k \in \mathbb{N}$ and every $\ell \in \mathbb{N}$, the func-
tion $\mathrm{PAR}_\ell : \{0,1\}^\ell \to \{0,1\}$ defined by $\mathrm{PAR}_\ell(x_1,\ldots,x_\ell) = \bigoplus_{i=1}^{\ell} x_i$ is
$(s_k(\ell), 1/2 - 1/s_k(\ell))$-average-case hard for nonuniform unbounded
fan-in circuits of depth k and size $s_k(\ell) = 2^{\Omega(\ell^{1/k})}$.


The proof of this result is beyond the scope of this survey; see the 
Chapter Notes and References for pointers. 

In addition to having an average-case hard function against
$\mathrm{AC}^0$, we also need that $\mathrm{AC}^0$ can compute arbitrary functions on a
logarithmic number of bits.


Lemma 7.28. Every function $g : \{0,1\}^a \to \{0,1\}$ can be computed
by a depth 2 circuit of size $2^a$.
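
One way to see Lemma 7.28 is via the truth table: write g as an OR of ANDs (a DNF), with one AND gate per accepting input. The Python sketch below illustrates this under our own naming (`dnf_terms` and `eval_dnf` are hypothetical helpers); it builds only the DNF, whereas choosing the smaller of the DNF and the analogous CNF is what keeps the total gate count, including the output gate, within $2^a$.

```python
from itertools import product

def dnf_terms(g, a):
    """One AND gate per accepting input y0 of g; the gate tests 'input == y0'."""
    return [y for y in product([0, 1], repeat=a) if g(y) == 1]

def eval_dnf(terms, y):
    """Depth-2 evaluation: OR over terms of the AND of a literals.

    The jth literal of a term is y_j if the term has a 1 in position j,
    and NOT y_j otherwise, so the AND is satisfied exactly when y equals the term.
    """
    return int(any(all(yj == tj for yj, tj in zip(y, term)) for term in terms))

majority3 = lambda y: int(sum(y) >= 2)   # example function on a = 3 bits
terms = dnf_terms(majority3, 3)
print(len(terms))  # 4
```

In the proof of Theorem 7.29 this is applied with $a = \log m$, so each look-up table costs at most $2^a = m$ gates.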


Using these two facts with the Nisan–Wigderson pseudoran-
dom generator construction, we obtain the following pseudorandom
generator for constant-depth circuits.


Theorem 7.29. For every constant k and every m, there exists
a poly(m)-time computable $(m, 1/m)$-pseudorandom generator
$G_m : \{0,1\}^{\log^{O(k)} m} \to \{0,1\}^m$ fooling unbounded fan-in circuits of
depth k (and size m).


Proof. This is proven similarly to Theorems 7.19 and 7.24, except
that we take $f = \mathrm{PAR}_\ell$ rather than a hard function in E, and we
observe that the reduction can be implemented in a way that increases
the depth by only an additive constant. Specifically, to obtain a
pseudorandom generator fooling circuits of depth k and size m, we
use the hardness of $\mathrm{PAR}_\ell$ against unbounded fan-in circuits of depth
$k' = k + 2$ and size $m^2$, where $\ell = s_{k'}^{-1}(m^2) = O(\log^{k'} m)$. Then the
seed length of G is $O(\ell^2/a) < O(\ell^2) = \log^{O(k)} m$.

We now follow the steps of the proof of Theorem 7.19 to go from
an adversary T of depth k violating the pseudorandomness of G to a
circuit A of depth k' calculating the parity function $\mathrm{PAR}_\ell$.

If T has depth k, then it can be verified that the next-bit predictor
P constructed in the proof of Proposition 7.16 also has depth k. (Recall
that negations and constants can be propagated to the inputs so they
do not contribute to the depth.) Next, in the proof of Theorem 7.24,
we obtain A from P by $A(y) = P(f_1(y) f_2(y) \cdots f_{i-1}(y))$ for some
$i \in \{1,\ldots,m\}$ and where each $f_j$ depends on at most a bits of y. Now
we observe that A can be computed by a small constant-depth circuit
(if P can). Specifically, applying Lemma 7.28 to each $f_j$, the size of
A is at most $(m-1) \cdot 2^a + m = m^2$ (recall $a = \log m$) and the depth of A is at most
$k' = k + 2$. This contradicts the hardness of $\mathrm{PAR}_\ell$.


Corollary 7.30. $\mathrm{BPAC}^0 \subset \tilde{\mathrm{P}}$.


With more work, this can be strengthened to actually put $\mathrm{BPAC}^0$
in $\widetilde{\mathrm{AC}}^0$, i.e., uniform constant-depth circuits of quasipolynomial size.
(The difficulty is that we use majority voting in the derandomization, 
but small constant-depth circuits cannot compute majority. However, 
they can compute an “approximate” majority, and this suffices.) 

The above pseudorandom generator can also be used to give 
a quasipolynomial-time derandomization of the randomized algo- 
rithm we saw for approximately counting the number of satisfying 
assignments to a DNF formula (Theorem 2.34); see Problem 7.4. 

Improving the running time of either of these derandomizations to 
polynomial is an intriguing open problem. 


Open Problem 7.31. Show that $\mathrm{BPAC}^0 = \mathrm{AC}^0$ or even
$\mathrm{BPAC}^0 \subset \mathrm{P}$.


Open Problem 7.32 (Open Problem 2.36, restated). Give a 
deterministic polynomial-time algorithm for approximately counting 
the number of satisfying assignments to a DNF formula.

We remark that it has recently been shown how to give an average-
case $\mathrm{AC}^0$ simulation of $\mathrm{BPAC}^0$ (i.e., the derandomized algorithm is
correct on most inputs); see Problem 7.5.

Another open problem is to construct similar, unconditional
pseudorandom generators as Theorem 7.29 for circuit classes larger
than $\mathrm{AC}^0$. A natural candidate is $\mathrm{AC}^0[2]$, which is the same as $\mathrm{AC}^0$
but augmented with unbounded-fan-in parity gates. There are known
explicit functions $f : \{0,1\}^\ell \to \{0,1\}$ (e.g., Majority) for which every
$\mathrm{AC}^0[2]$ circuit of depth k computing f has size at least $s_k(\ell) = 2^{\ell^{\Omega(1/k)}}$,
but unfortunately the average-case hardness is much weaker than
we need. These functions are only $(s_k(\ell), 1/2 - 1/O(\ell))$-average-case
hard, rather than $(s_k(\ell), 1/2 - 1/s_k(\ell))$-average-case hard, so we can
only obtain a small stretch using Theorem 7.24 and the following
remains open.


Open Problem 7.33. For every constant k and every m, construct a
(mildly) explicit $(m, 1/4)$-pseudorandom generator $G_m : \{0,1\}^{m^{o(1)}} \to
\{0,1\}^m$ fooling $\mathrm{AC}^0[2]$ circuits of depth k and size m.


7.5 Worst-Case/Average-Case Reductions and 
Locally Decodable Codes 


In the previous section, we saw how to construct pseudorandom
generators from boolean functions that are very hard on average,
where every nonuniform algorithm running in time t must err with
probability greater than $1/2 - 1/t$ on a random input. Now we want
to relax the assumption to refer to worst-case hardness, as captured
by the following definition.


Definition 7.34. A function $f : \{0,1\}^\ell \to \{0,1\}$ is worst-case hard
for time t if, for all nonuniform probabilistic algorithms A running in
time t, there exists $x \in \{0,1\}^\ell$ such that $\Pr[A(x) \neq f(x)] > 1/3$, where
the probability is over the coin tosses of A.


Note that, for deterministic algorithms A, the definition simply says
$\exists x\; A(x) \neq f(x)$. In the nonuniform case, restricting to deterministic
algorithms is without loss of generality because we can always deran-
domize the algorithm using (additional) nonuniformity. Specifically,
following the proof that $\mathrm{BPP} \subset \mathrm{P/poly}$, it can be shown that if f
is worst-case hard for nonuniform deterministic algorithms running
in time t, then it is worst-case hard for nonuniform probabilistic
algorithms running in time t' for some $t' = \Omega(t/\ell)$.

A natural goal is to be able to construct an average-case hard func-
tion from a worst-case hard function. More formally, given a function
$f : \{0,1\}^\ell \to \{0,1\}$ that is worst-case hard for time $t = t(\ell)$, construct a
function $\hat f : \{0,1\}^{O(\ell)} \to \{0,1\}$ such that $\hat f$ is average-case hard for time
$t' = t^{\Omega(1)}$. Moreover, we would like $\hat f$ to be in E if f is in E. (Whether we
can obtain a similar result for NP is a major open problem, and indeed
there are negative results ruling out natural approaches to doing so.)

Our approach to doing this will be via error-correcting codes.
Specifically, we will show that if $\hat f$ is the encoding of f in an appropri-
ate kind of error-correcting code, then worst-case hardness of f implies
average-case hardness of $\hat f$.

Specifically, we view f as a message of length $L = 2^\ell$ and apply an
error-correcting code $\mathrm{Enc} : \{0,1\}^L \to \Sigma^{\hat L}$ to obtain $\hat f = \mathrm{Enc}(f)$, which
we view as a function $\hat f : \{0,1\}^{\hat\ell} \to \Sigma$, where $\hat\ell = \log \hat L$. Pictorially:


[message $f : \{0,1\}^\ell \to \{0,1\}$] → [Enc] → [codeword $\hat f : \{0,1\}^{\hat\ell} \to \Sigma$].


(Ultimately, we would like $\Sigma = \{0,1\}$, but along the way we will
discuss larger alphabets.)

Now we argue the average-case hardness of $\hat f$ as follows. Suppose,
for contradiction, that $\hat f$ is not $\delta$ average-case hard. By definition,
there exists an efficient algorithm A with $\Pr[A(x) = \hat f(x)] > 1 - \delta$.
We may assume that A is deterministic by fixing its coins. Then A
may be viewed as a received word in $\Sigma^{\hat L}$, and our condition on A
becomes $d_H(A, \hat f) < \delta$. So if Dec is a $\delta$-decoding algorithm for Enc,
then Dec(A) = f. By assumption A is efficient, so if Dec is efficient,
then f may be efficiently computed everywhere. This would contradict
our worst-case hardness assumption, assuming that Dec(A) gives a
time $t(\ell)$ algorithm for f. However, the standard notion of decoding
requires reading all $2^{\hat\ell}$ values of the received word A and writing all $2^\ell$
values of the message Dec(A), and thus $\mathrm{Time}(\mathrm{Dec}(A)) \gg 2^\ell$. But every
function on $\ell$ bits can be computed in nonuniform time $2^\ell$, and even
in the uniform case we are mostly interested in $t(\ell) \ll 2^\ell$. To solve this
problem we introduce the notion of local decoding.


Definition 7.35. A local $\delta$-decoding algorithm for a code
$\mathrm{Enc} : \{0,1\}^L \to \Sigma^{\hat L}$ is a probabilistic oracle algorithm Dec with
the following property. Let $f : [L] \to \{0,1\}$ be any message with
associated codeword $\hat f = \mathrm{Enc}(f)$, and let $g : [\hat L] \to \Sigma$ be such that
$d_H(g, \hat f) < \delta$. Then for all $x \in [L]$ we have $\Pr[\mathrm{Dec}^g(x) = f(x)] \ge 2/3$,
where the probability is taken over the coin flips of Dec.


In other words, given oracle access to g, we want to efficiently 
compute any desired bit of f with high probability. So both the input 
(namely g) and the output (namely f) are treated implicitly; the 
decoding algorithm does not need to read/write either in its entirety. 
Pictorially:


              g
              |  (oracle access)
              v
    x  --->  Dec  --->  f(x)


This makes it possible to have sublinear-time (or even 
polylogarithmic-time) decoding. Also, we note that the bound of 2/3 
in the definition can be amplified in the usual way. Having formalized a 
notion of local decoding, we can now make our earlier intuition precise. 


Proposition 7.36. Let Enc be an error-correcting code with local
$\delta$-decoding algorithm Dec that runs in nonuniform time at most $t_{\mathrm{Dec}}$
(meaning that Dec is a boolean circuit of size at most $t_{\mathrm{Dec}}$ equipped
with oracle gates), and let f be worst-case hard for nonuniform time
t. Then $\hat f = \mathrm{Enc}(f)$ is $(t', \delta)$ average-case hard, where $t' = t/t_{\mathrm{Dec}}$.

Proof. We do everything as explained before except with $\mathrm{Dec}^A$ in place
of Dec(A), and now the running time is at most $\mathrm{Time}(\mathrm{Dec}) \cdot \mathrm{Time}(A)$.
(We substitute each oracle gate in the circuit for Dec with the circuit
for A.) □


We note that the reduction in this proof does not use nonuniformity
in an essential way. We used nonuniformity to fix the coin tosses of A,
making it deterministic. To obtain a version for hardness against uni-
form probabilistic algorithms, the coin tosses of A can be chosen and
fixed randomly instead. With high probability, the fixed coins will not
increase A's error by more than a constant factor (by Markov’s Inequal-
ity); we can compensate for this by replacing the $(t', \delta)$ average-case
hardness in the conclusion with, say, $(t', \delta/3)$ average-case hardness.

In light of the above proposition, our task is now to find an error-
correcting code $\mathrm{Enc} : \{0,1\}^L \to \Sigma^{\hat L}$ with a local decoding algorithm.
Specifically, we would like the following parameters.


(1) We want $\hat\ell = O(\ell)$, or equivalently $\hat L = \mathrm{poly}(L)$. This is
because we measure hardness as a function of input length
(which in turn translates to the relationship between output
length and seed length of pseudorandom generators obtained
via Theorem 7.19). In particular, when $t = 2^{\Omega(\ell)}$, we'd like
to achieve $t' = 2^{\Omega(\hat\ell)}$. Since $t' < t$ in Proposition 7.36, this is
only possible if $\hat\ell = O(\ell)$.

(2) We would like Enc to be computable in time $2^{O(\hat\ell)} = \mathrm{poly}(\hat L)$,
which is poly(L) if we satisfy the requirement $\hat L = \mathrm{poly}(L)$.
This is because we want $f \in \mathrm{E}$ to imply $\hat f \in \mathrm{E}$.

(3) We would like $\Sigma = \{0,1\}$ so that $\hat f$ is a boolean function, and
$\delta = 1/2 - \varepsilon$ so that $\hat f$ has sufficient average-case hardness for
the pseudorandom generator construction of Theorem 7.24.

(4) Since $\hat f$ will be average-case hard against time $t' = t/t_{\mathrm{Dec}}$, we
would want the running time of Dec to be $t_{\mathrm{Dec}} = \mathrm{poly}(\ell, 1/\varepsilon)$
so that we can take $\varepsilon = 1/t^{\Omega(1)}$ and still have $t' = t^{\Omega(1)}/\mathrm{poly}(\ell)$.


Of course, achieving $\delta = 1/2 - \varepsilon$ is not possible with our current
notion of local unique decoding (which is only harder than the
standard notion of unique decoding), and thus in the next section
we will focus on getting $\delta$ to be just a fixed constant. In Section 7.6,
we will introduce a notion of local list decoding, which will enable
decoding from distance $\delta = 1/2 - \varepsilon$.

In our constructions, it will be more natural to focus on the task
of decoding codeword symbols rather than message symbols. That is,
we replace the message f with the codeword $\hat f$ in Definition 7.35 to
obtain the following notion:


Definition 7.37 (Locally Correctible Codes).⁴ A local $\delta$-correcting
algorithm for a code $C \subset \Sigma^{\hat L}$ is a probabilistic oracle algorithm Dec with
the following property. Let $\hat f \in C$ be any codeword, and let $g : [\hat L] \to \Sigma$
be such that $d_H(g, \hat f) < \delta$. Then for all $x \in [\hat L]$ we have
$\Pr[\mathrm{Dec}^g(x) = \hat f(x)] \ge 2/3$, where the probability is taken over the coin flips of Dec.


This implies the standard definition of locally decodable codes 
under the (mild) constraint that the message symbols are explicitly 
included in the codeword, as captured by the following definition (see 
also Problem 5.4). 


Definition 7.38 (Systematic Encodings). An encoding algo-
rithm $\mathrm{Enc} : \{0,1\}^L \to C$ for a code $C \subset \Sigma^{\hat L}$ is systematic if there is
a polynomial-time computable function $I : [L] \to [\hat L]$ such that for
all $f \in \{0,1\}^L$, $\hat f = \mathrm{Enc}(f)$, and all $x \in [L]$, we have $\hat f(I(x)) = f(x)$,
where we interpret 0 and 1 as elements of $\Sigma$ in some canonical way.


Informally, this means that the message f can be viewed as the
restriction of the codeword $\hat f$ to the coordinates in the image of I.


Lemma 7.39. If $\mathrm{Enc} : \{0,1\}^L \to C$ is systematic and C has a local $\delta$-
correcting algorithm running in time t, then Enc has a local $\delta$-decoding
algorithm (in the standard sense) running in time t + poly(log L).


Proof. If $\mathrm{Dec}_1$ is the local corrector for C and I the mapping in the
definition of systematic encoding, then $\mathrm{Dec}_2^g(x) = \mathrm{Dec}_1^g(I(x))$ is a local
decoder for Enc. □


⁴ In the literature, these are often called self-correctible codes.


7.5.1 Local Decoding Algorithms 


Hadamard Code. Recall the Hadamard code of message length
m, which consists of the truth tables of all $\mathbb{Z}_2$-linear functions
$c : \{0,1\}^m \to \{0,1\}$ (Construction 5.12).


Proposition 7.40. The Hadamard code $C \subset \{0,1\}^{2^m}$ of message
length m has a local $(1/4 - \varepsilon)$-correcting algorithm running in time
$\mathrm{poly}(m, 1/\varepsilon)$.


Proof. We are given oracle access to $g : \{0,1\}^m \to \{0,1\}$ that is at
distance less than $1/4 - \varepsilon$ from some (unknown) linear function c, and
we want to compute c(x) at an arbitrary point $x \in \{0,1\}^m$. The idea
is random self-reducibility: we can reduce computing c at an arbitrary
point to computing c at uniformly random points, where g is likely
to give the correct answer. Specifically, $c(x) = c(x \oplus r) \oplus c(r)$ for
every r, and both $x \oplus r$ and r are uniformly distributed if we choose
r uniformly at random from $\{0,1\}^m$. The probability that g differs from c at either of these
points is less than $2 \cdot (1/4 - \varepsilon) = 1/2 - 2\varepsilon$. Thus $g(x \oplus r) \oplus g(r)$ gives
the correct answer with probability noticeably larger than 1/2. We can
amplify this success probability by repetition. Specifically, we obtain
the following local corrector:


Algorithm 7.41 (Local Corrector for Hadamard Code).
Input: An oracle $g : \{0,1\}^m \to \{0,1\}$, $x \in \{0,1\}^m$, and a parame-
ter $\varepsilon > 0$


(1) Choose $r_1,\ldots,r_t$ uniformly at random from $\{0,1\}^m$, for $t = O(1/\varepsilon^2)$.
(2) Query $g(r_j)$ and $g(r_j \oplus x)$ for each $j = 1,\ldots,t$.
(3) Output $\mathrm{maj}_{1 \le j \le t}\{g(r_j) \oplus g(r_j \oplus x)\}$.


If $d_H(g,c) < 1/4 - \varepsilon$, then this algorithm will output c(x) with
probability at least 2/3. □
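
Algorithm 7.41 is short enough to run directly. The following Python sketch is one possible rendering under our own conventions: points of $\{0,1\}^m$ are encoded as integers in `range(2**m)`, the oracle g is a Python function, and the number of trials is taken to be roughly $1/\varepsilon^2$ (forced odd so the majority is never tied). The constant in $t = O(1/\varepsilon^2)$ and all names are assumptions for illustration.

```python
import random

def inner_product_mod2(a, x):
    """The Z_2-linear function c_a(x) = <a, x> mod 2, with a and x encoded as integers."""
    return bin(a & x).count("1") % 2

def local_correct_hadamard(g, x, m, eps, rng):
    """Majority vote over t = O(1/eps^2) estimates g(r) XOR g(r XOR x) of c(x)."""
    t = int(1 / eps ** 2) | 1            # roughly 1/eps^2 trials, forced odd
    votes = 0
    for _ in range(t):
        r = rng.getrandbits(m)           # uniformly random point of {0,1}^m
        votes += g(r) ^ g(r ^ x)
    return int(2 * votes > t)

m, a, eps = 8, 0b10110101, 0.1
rng = random.Random(0)
# Corrupt 25 of the 256 codeword positions: relative distance < 1/4 - eps = 0.15.
corrupted = set(rng.sample(range(2 ** m), 25))
g = lambda z: inner_product_mod2(a, z) ^ int(z in corrupted)

x = 0b01100110
print(local_correct_hadamard(g, x, m, eps, rng), inner_product_mod2(a, x))
```

Each call makes 2t oracle queries regardless of the blocklength $2^m$, which is what makes the corrector local.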


This local decoding algorithm is optimal in terms of its decoding
distance (arbitrarily close to 1/4) and running time (logarithmic in
the blocklength), but the problem is that the Hadamard code has
exponentially small rate.


Reed-Muller Code. Recall that the q-ary Reed-Muller code of
degree d and dimension m consists of all multivariate polynomials
$p : \mathbb{F}_q^m \to \mathbb{F}_q$ of total degree at most d. (Construction 5.16.) This code
has minimum distance $\delta = 1 - d/q$. Reed-Muller Codes are a common
generalization of both Hadamard and Reed-Solomon codes, and thus 
we can hope that for an appropriate setting of parameters, we will 
be able to get the best of both kinds of codes. That is, we want to 
combine the efficient local decoding of the Hadamard code with the 
good rate of Reed-Solomon codes. 


Theorem 7.42. The q-ary Reed-Muller Code of degree d and 
dimension m has a local 1/12-correcting algorithm running in time 
poly(m,q) provided d < q/9 and q > 36. 


Note the running time of the decoder is roughly the mth root
of the block length $\hat L = q^m$. When m = 1, our decoder can query
the entire string and we simply obtain a global decoding algorithm
for Reed-Solomon Codes (which we already know how to achieve
from Theorem 5.19). But for large enough m, the decoder can only
access a small fraction of the received word. (In fact, one can improve
the running time to poly(m, d, log q), but the weaker result above is
sufficient for our purposes.)

The key idea behind the decoder is to do restrictions to random
lines in $\mathbb{F}^m$. The restriction of a Reed–Muller codeword to such a line
is a Reed-Solomon codeword, and we can afford to run our global
Reed-Solomon decoding algorithm on the line.

Formally, for x, y ∈ F^m, we define the (parameterized) line through 
x in direction y as the function ℓ_{x,y} : F → F^m given by ℓ_{x,y}(t) = x + ty. 
Note that for every a ∈ F, b ∈ F \ {0}, the line ℓ_{x+ay,by} has the same 
set of points in its image as ℓ_{x,y}; we refer to this set of points as an 
unparameterized line. When y = 0, the parameterized line contains only 
the single point x. 

If g : F^m → F is any function and ℓ : F → F^m is a line, then we use 
g|_ℓ to denote the restriction of g to ℓ, which is simply the composition 
g ∘ ℓ : F → F. Note that if p is any polynomial of total degree at most 
d, then p|_ℓ is a (univariate) polynomial of degree at most d. 

So we are given an oracle g of distance less than δ from some 
degree d polynomial p : F^m → F, and we want to compute p(x) for 
some x ∈ F^m. We begin by choosing a random line ℓ through x. Every 
point of F^m \ {x} lies on exactly one parameterized line through x, 
so the points on ℓ (except x) are distributed uniformly at random 
over the whole domain, and thus g and p are likely to agree on these 
points. Thus we can hope to use the points on this line to reconstruct 
the value of p(x). If δ is sufficiently small compared to the degree 
(e.g., δ = 1/(3(d + 1))), we can simply interpolate the value of p(x) from 
d + 1 points on the line. This gives rise to the following algorithm. 


Algorithm 7.43 (Local Corrector for Reed-Muller Code I). 
Input: An oracle g : F^m → F, an input x ∈ F^m, and a degree 
parameter d 


(1) Choose y uniformly at random from F^m. Let ℓ = ℓ_{x,y} : F → F^m be 
the line through x in direction y. 

(2) Query g to obtain β_0 = g|_ℓ(α_0) = g(ℓ(α_0)), …, β_d = g|_ℓ(α_d) = 
g(ℓ(α_d)), where α_0, …, α_d ∈ F \ {0} are any fixed points 

(3) Interpolate to find the unique univariate polynomial q of 
degree at most d s.t. ∀i, q(α_i) = β_i 

(4) Output q(0) 
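
The four steps of Algorithm 7.43 can be sketched in code as follows. This is only an illustration under stated assumptions: the field is taken to be the prime field Z_P with P = 13 (any prime larger than d + 1 would do), the fixed points are α_i = 1, …, d + 1, and the names lagrange_at_zero, local_correct_I, and example_poly are hypothetical helpers, not part of the text.

```python
import random

P = 13  # assumed prime, so that the integers modulo P form a field


def lagrange_at_zero(alphas, betas, p=P):
    """Evaluate at 0 the unique polynomial of degree <= d through (alpha_i, beta_i)."""
    total = 0
    for i, (a_i, b_i) in enumerate(zip(alphas, betas)):
        num, den = 1, 1
        for j, a_j in enumerate(alphas):
            if j != i:
                num = num * (-a_j) % p        # factor (0 - alpha_j)
                den = den * (a_i - a_j) % p   # factor (alpha_i - alpha_j)
        total = (total + b_i * num * pow(den, -1, p)) % p
    return total


def local_correct_I(g, x, d, p=P, rng=random):
    """One run of the corrector on oracle g at the point x (a tuple over Z_p)."""
    y = [rng.randrange(p) for _ in x]             # step 1: random direction y
    alphas = list(range(1, d + 2))                # d + 1 fixed nonzero points
    betas = [g(tuple((xi + a * yi) % p for xi, yi in zip(x, y)))
             for a in alphas]                     # step 2: query g on the line
    return lagrange_at_zero(alphas, betas, p)     # steps 3-4: interpolate, output q(0)


def example_poly(x, p=P):
    """Hypothetical codeword: a polynomial of total degree 2 in m = 2 variables."""
    return (3 * x[0] * x[1] + 5 * x[0] + 7) % p
```

With an uncorrupted oracle, local_correct_I(example_poly, (4, 9), 2) returns example_poly((4, 9)) for every choice of y; with a corrupted oracle the guarantee is only the probabilistic one of Claim 7.44.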


Claim 7.44. If g has distance less than δ = 1/(3(d + 1)) from some 
polynomial p of degree at most d, then Algorithm 7.43 will output 
p(x) with probability greater than 2/3. 


Proof of Claim: Observe that for all x ∈ F^m and α_i ∈ F \ {0}, 
ℓ_{x,y}(α_i) is uniformly random in F^m over the choice of y ∈ F^m. This 
implies that for each i, 

\[ \Pr_{\ell}\left[ g|_{\ell}(\alpha_i) \neq p|_{\ell}(\alpha_i) \right] < \delta = \frac{1}{3(d+1)}. \]

By a union bound, 

\[ \Pr_{\ell}\left[ \exists i,\ g|_{\ell}(\alpha_i) \neq p|_{\ell}(\alpha_i) \right] < (d+1)\cdot\delta = \frac{1}{3}. \]

Thus, with probability greater than 2/3, we have ∀i, q(α_i) = p|_ℓ(α_i) 
and hence q(0) = p(x). The running time of the algorithm 
is poly(m,q). □ 


We now show how to improve the decoder to handle a larger fraction 
of errors, up to distance δ = 1/12. We alter Steps 2 and 3 in the 
above algorithm. In Step 2, instead of querying only d + 1 points, 
we query all points on ℓ. In Step 3, instead of interpolation, 
we use a global decoding algorithm for Reed-Solomon codes to decode 
the univariate polynomial p|_ℓ. Formally, the algorithm proceeds as 
follows. 


Algorithm 7.45 (Local Corrector for Reed-Muller Codes II). 
Input: An oracle g : F^m → F, an input x ∈ F^m, and a degree 
parameter d, where q = |F| > 36 and d < q/9. 


(1) Choose y uniformly at random from F^m. Let ℓ = ℓ_{x,y} : F → F^m be 
the line through x in direction y. 

(2) Query g at all points on ℓ to obtain g|_ℓ : F → F. 

(3) Run the 1/3-decoder for the q-ary Reed-Solomon code of 
degree d on g|_ℓ to obtain the (unique) polynomial q at 
distance less than 1/3 from g|_ℓ (if one exists).⁵ 

(4) Output q(0). 
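
A minimal sketch of Algorithm 7.45 follows, under loud assumptions: the field is a prime field Z_p, and the Reed-Solomon 1/3-decoder is replaced by a brute-force search over all p^(d+1) polynomials, which is only feasible for tiny d and is not the efficient decoder the text relies on. The function names are hypothetical.

```python
import itertools
import random


def eval_poly(coeffs, t, p):
    """Horner evaluation of sum_k coeffs[k] * t^k modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * t + c) % p
    return acc


def rs_unique_decode(word, d, p):
    """Brute-force stand-in for the 1/3-decoder: search all polynomials of
    degree <= d for one at relative distance < 1/3 from word."""
    for coeffs in itertools.product(range(p), repeat=d + 1):
        errors = sum(eval_poly(coeffs, t, p) != word[t] for t in range(p))
        if 3 * errors < p:
            return coeffs
    return None


def local_correct_II(g, x, d, p, rng=random):
    """One run of the corrector: returns q(0), or None if decoding fails."""
    y = [rng.randrange(p) for _ in x]                       # step 1
    word = [g(tuple((xi + t * yi) % p for xi, yi in zip(x, y)))
            for t in range(p)]                              # step 2: all of g on the line
    q_poly = rs_unique_decode(word, d, p)                   # step 3
    return None if q_poly is None else q_poly[0]            # step 4: q(0) is the constant term
```

For example, with p = 37 and d = 1 (so that q > 36 and d < q/9), an oracle that disagrees with a degree-1 polynomial on a handful of points is corrected at any other point x whenever the chosen line meets fewer than p/3 corrupted points.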


⁵ A 1/3-decoder for Reed-Solomon codes follows from the (1 − 2√(d/q)) list-decoding 
algorithm of Theorem 5.19. Since 1/3 < 1 − 2√(d/q), the list-decoder will produce a list 
containing all univariate polynomials at distance less than 1/3, and since 1/3 is smaller than 
half the minimum distance (1 − d/q), there will be only one good decoding. 




Claim 7.46. If g has distance less than δ = 1/12 from some polynomial 
p of degree at most d, and the parameters satisfy q = |F| > 36, d < q/9, 
then Algorithm 7.45 will output p(x) with probability greater than 2/3. 


Proof of Claim: The expected distance (between g|_ℓ and p|_ℓ) is small: 

\[ \mathop{\mathrm{E}}_{\ell}\left[ d_H(g|_{\ell}, p|_{\ell}) \right] < \frac{1}{q} + \delta < \frac{1}{36} + \frac{1}{12} = \frac{1}{9}, \]

where the term 1/q is due to the fact that the point x is not random. 
Therefore, by Markov's Inequality, 

\[ \Pr\left[ d_H(g|_{\ell}, p|_{\ell}) \geq 1/3 \right] < 1/3. \]

Thus, with probability at least 2/3, we have that p|_ℓ is the unique 
polynomial of degree at most d at distance less than 1/3 from g|_ℓ and 
thus q must equal p|_ℓ. □ 


7.5.2 Low-Degree Extensions 


Recall that to obtain locally decodable codes from locally correctible 
codes (as constructed above), we need to exhibit a systematic encoding 
(Definition 7.38). Thus, given f : [L] → {0,1}, we want to encode it as 
a Reed-Muller codeword f̂ : [L̂] → Σ s.t.: 


• The encoding time is 2^{O(ℓ)} = poly(L). 
• ℓ̂ = O(ℓ), or equivalently L̂ = poly(L). 
• The code is systematic in the sense of Definition 7.38. 


Note that the usual encoding for Reed-Muller codes, where the message 
gives the coefficients of the polynomial, is not systematic. Instead 
the message should correspond to evaluations of the polynomial at 
certain points. Once we settle on the set of evaluation points, the task 
becomes one of interpolating the values at these points (given by the 
message) to a low-degree polynomial defined everywhere. 

The simplest approach is to use the boolean hypercube as the set 
of evaluation points. 


Lemma 7.47. (multilinear extension) For every f : {0,1}^ℓ → {0,1} 
and every finite field F, there exists a (unique) polynomial f̂ : F^ℓ → F 
such that f̂|_{{0,1}^ℓ} = f and f̂ has degree at most 1 in each variable (and 
hence total degree at most ℓ). 


Proof. We prove the existence of the polynomial f̂. Define 

\[ \hat{f}(x_1,\ldots,x_\ell) = \sum_{\alpha \in \{0,1\}^\ell} f(\alpha)\,\delta_\alpha(x) \]

for 

\[ \delta_\alpha(x) = \Bigl( \prod_{i\,:\,\alpha_i = 1} x_i \Bigr) \Bigl( \prod_{i\,:\,\alpha_i = 0} (1 - x_i) \Bigr). \]

Note that for x ∈ {0,1}^ℓ, δ_α(x) = 1 only when α = x, therefore 
f̂|_{{0,1}^ℓ} = f. We omit the proof of uniqueness. The bound on the 
individual degrees is by inspection. □ 
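
The formula in the proof can be evaluated directly. The sketch below assumes a prime field Z_p and represents f as a Python function on 0/1 tuples; multilinear_extension is a hypothetical name, and each evaluation sums over all 2^ℓ points of the hypercube exactly as in the proof.

```python
from itertools import product


def multilinear_extension(f, ell, p):
    """Return fhat : Z_p^ell -> Z_p, the multilinear extension of f : {0,1}^ell -> {0,1}."""
    def fhat(x):
        total = 0
        for alpha in product((0, 1), repeat=ell):
            delta = 1  # delta_alpha(x): product of x_i (alpha_i = 1) and 1 - x_i (alpha_i = 0)
            for a_i, x_i in zip(alpha, x):
                delta = delta * (x_i if a_i == 1 else 1 - x_i) % p
            total = (total + f(alpha) * delta) % p
        return total
    return fhat
```

For instance, the extension of XOR on two bits is x_1 + x_2 − 2·x_1·x_2, which agrees with XOR on {0,1}^2 and takes the value 0 at the point (2, 3) over Z_7.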


Thinking of f̂ as an encoding of f, let's inspect the properties of 
this encoding. 


• Since the total degree of the multilinear extension can be 
as large as ℓ, we need q > 9ℓ for the local corrector of 
Theorem 7.42 to apply. 

• The encoding time is 2^{O(ℓ̂)}, as computing a single point of 
f̂ requires summing over 2^ℓ elements, and we have 2^{ℓ̂} points 
on which to compute f̂. 

• The code is systematic, since f̂ is an extension of f. 

• However, the input length is ℓ̂ = ℓ log q = Θ(ℓ log ℓ), which is 
slightly larger than our target of ℓ̂ = O(ℓ). 


To solve the problem of the input length ℓ̂ in the multilinear 
encoding, we reduce the dimension of the polynomial f̂ by changing 
the embedding of the domain of f: Instead of interpreting {0,1}^ℓ ⊂ F^ℓ 
as an embedding of the domain of f in F^ℓ, we map {0,1}^ℓ to H^m for 
some subset H ⊂ F, and as such embed it in F^m. 

More precisely, we fix a subset H ⊂ F of size |H| = ⌈√q⌉. Choose 
m = ⌈ℓ/log|H|⌉, and fix some efficient one-to-one mapping from {0,1}^ℓ 
into H^m. With this mapping, view f as a function f : H^m → F. 
Analogously to before, we have the following. 


Lemma 7.48. (low-degree extension) For every finite field F, 
H ⊂ F, m ∈ N, and function f : H^m → F, there exists a (unique) 
f̂ : F^m → F such that f̂|_{H^m} = f and f̂ has degree at most |H| − 1 in 
each variable (and hence has total degree at most m · (|H| − 1)). 


Using |H| = ⌈√q⌉, the total degree of f̂ is at most d = ℓ√q. So we 
can apply the local corrector of Theorem 7.42, as long as q > 81ℓ² (so 
that d < q/9). Inspecting the properties of f̂ as an encoding of f, we 
have: 


• The input length is ℓ̂ = m · log q = ⌈ℓ/log|H|⌉ · log q = O(ℓ), 
as desired. (We can use a field of size 2^k for k ∈ N, so that L̂ 
is a power of 2 and we incur no loss in encoding inputs to f̂ 
as bits.) 

• The code is systematic as long as our mapping from {0,1}^ℓ 
to H^m is efficient. 
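
To see why the input length comes out linear, the following short calculation may help. It is a sketch that uses only |H| = ⌈√q⌉ and m = ⌈ℓ/log|H|⌉ from above, and it assumes |H| ≥ 2:

\[
\log|H| \ge \tfrac{1}{2}\log q
\quad\Longrightarrow\quad
m \le \frac{\ell}{\log|H|} + 1 \le \frac{2\ell}{\log q} + 1
\quad\Longrightarrow\quad
\hat{\ell} = m \log q \le 2\ell + \log q .
\]

If q is chosen polynomial in ℓ (for instance, just above 81ℓ²), then log q = O(log ℓ), and so ℓ̂ = O(ℓ).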


Note that not every polynomial of total degree at most m · (|H| − 1) 
is the low-degree extension of a function f : H^m → F, so the image of 
our encoding function f ↦ f̂ is only a subcode of the Reed-Muller code. 
This is not a problem, because any subcode of a locally correctible 
code is also locally correctible, and we can afford the loss in rate (all 
we need is ℓ̂ = O(ℓ)). 


7.5.3 Putting It Together 


Combining Theorem 7.42 with Lemmas 7.48 and 7.39, we obtain the 
following locally decodable code: 


Proposition 7.49. For every L ∈ N, there is an explicit code Enc : 
{0,1}^L → Σ^{L̂}, with blocklength L̂ = poly(L) and alphabet size |Σ| = 
poly(log L), that has a local (1/12)-decoder running in time poly(log L). 


Using Proposition 7.36, we obtain the following conversion from 
worst-case hardness to average-case hardness: 


Proposition 7.50. If there exists f : {0,1}^ℓ → {0,1} in E that is 
worst-case hard against (nonuniform) time t(ℓ), then there exists 
f̂ : {0,1}^{O(ℓ)} → {0,1}^{O(log ℓ)} in E that is (t′(ℓ), 1/12) average-case hard 
for t′(ℓ) = t(ℓ)/poly(ℓ). 


This differs from our original goal in two ways: f̂ is not Boolean, 
and we only get hardness 1/12 (instead of 1/2 − ε). The former concern 
can be remedied by concatenating the code of Proposition 7.49 with 
a Hadamard code, similarly to Problem 5.2. Note that the Hadamard 
code is applied on message space Σ, which is of size polylog(L), so it 
can be 1/4-decoded by brute-force in time polylog(L) (which is the 
amount of time already taken by our decoder).⁶ Using this, we obtain: 


Theorem 7.51. For every L ∈ N, there is an explicit code 
Enc : {0,1}^L → {0,1}^{L̂} with blocklength L̂ = poly(L) that has a 
local (1/48)-decoder running in time poly(log L). 


Theorem 7.52. If there exists f : {0,1}^ℓ → {0,1} in E that is worst- 
case hard against time t(ℓ), then there exists f̂ : {0,1}^{O(ℓ)} → {0,1} in 
E that is (t′(ℓ), 1/48) average-case hard, for t′(ℓ) = t(ℓ)/poly(ℓ). 


An improved decoding distance can be obtained using Problem 7.7. 

We note that the local decoder of Theorem 7.51 not only runs 
in time poly(log L), but also makes poly(log L) queries. For some 
applications (such as Private Information Retrieval, see Problem 7.6), 
it is important to have the number q of queries be as small as possible, 
ideally a constant. Using Reed-Muller codes of constant degree, it 
is possible to obtain constant-query locally decodable codes, but the 
blocklength will be L̂ = exp(L^{1/(q−1)}). In a recent breakthrough, it 
was shown how to obtain constant-query locally decodable codes with 
blocklength L̂ = exp(L^{o(1)}). Obtaining polynomial blocklength remains 
open. 


⁶ Some readers may recognize this concatenation step as the same as applying the 
“Goldreich-Levin hardcore predicate” to f̂. (See Problems 7.12 and 7.13.) However, for 
the parameters we are using, we do not need the power of these results, and can afford to 
perform brute-force unique decoding instead. 


Open Problem 7.53. Are there binary codes that are locally 
decodable with a constant number of queries (from constant distance 
δ > 0) and blocklength polynomial in the message length? 


7.5.4 Other Connections 


As shown in Problem 7.6, locally decodable codes are closely related 
to protocols for private information retrieval. Another connection, and 
actually the setting in which these local decoding algorithms were first 
discovered, is to program self-correctors. Suppose we have a program for 
computing a function, such as the Determinant, which happens to be 
a codeword in a locally decodable code (e.g., the determinant is a low- 
degree multivariate polynomial, and hence a Reed-Muller codeword). 
Then, even if this program has some bugs and gives the wrong answer 
on some small fraction of inputs, we can use the local decoding algo- 
rithm to obtain the correct answer on all inputs with high probability. 


7.6 Local List Decoding and PRGs from 
Worst-Case Hardness 


7.6.1 Hardness Amplification 


In the previous section, we saw how to use locally decodable codes to 
convert worst-case hard functions into ones with constant average-case 
hardness (Theorem 7.52). Now our goal is to amplify this hardness 
(e.g., to 1/2 − ε). 

There are some generic techniques for hardness amplification. In 
these methods, we evaluate the function on many independent inputs. 
For example, consider f′ that concatenates the evaluations of f on k 
independent inputs: 

\[ f'(x_1,\ldots,x_k) = (f(x_1),\ldots,f(x_k)). \]

Intuitively, if f is 1/12 average-case hard, then f′ should be (1 − 
(11/12)^k)-average-case hard because any efficient algorithm can solve 
each instance correctly with probability at most 11/12. Proving this 
is nontrivial (because the algorithm trying to compute f′ need not 
behave independently on the k instances), but there are Direct Product 
Theorems showing that the hardness does get amplified essentially 
as expected. Similarly, if we take the XOR on k independent inputs, the 
XOR Lemma says that the hardness approaches 1/2 exponentially fast. 
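
As a quick numerical illustration of the intuitive bound (not of any Direct Product Theorem itself), the sketch below tabulates the heuristic hardness 1 − (1 − δ)^k together with the input length k·ℓ it costs. Both function names are hypothetical.

```python
def direct_product_hardness(delta, k):
    """Heuristic hardness 1 - (1 - delta)^k of the k-fold direct product."""
    return 1 - (1 - delta) ** k


def direct_product_input_length(ell, k):
    """Input length of f' when f takes ell-bit inputs."""
    return k * ell


if __name__ == "__main__":
    for k in (1, 5, 10, 20, 40):
        print(k, round(direct_product_hardness(1 / 12, k), 4),
              direct_product_input_length(100, k))
```

The hardness climbs toward 1 as k grows, while the input length grows linearly in k.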

The main disadvantage of these approaches (for our purposes) is 
that the input length of f′ is kℓ while we aim for input length of O(ℓ). 
To overcome this problem, it is possible to use derandomized products, 
where we evaluate f on correlated inputs instead of independent ones. 

We will take a different approach, generalizing the notion and 
algorithms for locally decodable codes to locally list-decodable codes, 
and thereby directly construct f̂ that is (1/2 − ε)-hard. Nevertheless, 
the study of hardness amplification is still of great interest, because it 
(or variants) can be employed in settings where doing a global encod- 
ing of the function is infeasible (e.g., for amplifying the average-case 
hardness of functions in complexity classes lower than E, such as 
NP, and for amplifying the security of cryptographic primitives). We 
remark that results on hardness amplification can be interpreted in 
a coding-theoretic language as well, as converting locally decodable 
codes with a small decoding distance into locally list-decodable codes 
with a large decoding distance. (See Section 8.2.3.) 


7.6.2 Definition 


We would like to formulate a notion of local list-decoding to enable us 
to have binary codes that are locally decodable from distances close 
to 1/2. This is somewhat tricky to define: what does it mean to 
produce a “list” of decodings when only asked to decode a particular 
coordinate? Let g be our received word, and f_1, f_2, …, f_s the codewords 
that are close to g. One option would be for the decoding algorithm, 
on input x, to output a set of values Dec^g(x) ⊂ Σ that is guaranteed 
to contain f_1(x), f_2(x), …, f_s(x) with high probability. However, this is 
not very useful; in the common case that s > |Σ|, the list could always 
be Dec^g(x) = Σ. Rather than outputting all of the values, we want to 
be able to specify to our decoder which f_i(x) to output. We do this 
with a two-phase decoding algorithm (Dec_1, Dec_2), where both phases 
can be randomized. 


(1) Dec_1, using g as an oracle and not given any input 
other than the parameters defining the code, returns a list 
of advice strings a_1, a_2, …, a_s, which can be thought of as 
“labels” for each of the codewords close to g. 

(2) Dec_2 (again, using oracle access to g), takes input x and a_i, 
and outputs f_i(x). 


The picture for Dec_2 is much like our old decoder, but it takes an 
extra input a_i corresponding to one of the outputs of Dec_1: 


[Figure: Dec_2, with oracle access to g, takes inputs x and a_i and outputs f_i(x).] 


More formally: 


Definition 7.54. A local δ-list-decoding algorithm for a code Enc is 
a pair of probabilistic oracle algorithms (Dec_1, Dec_2) such that for all 
received words g and all codewords f̂ = Enc(f) with d_H(f̂, g) < δ, the 
following holds. With probability at least 2/3 over (a_1, …, a_s) ← Dec_1^g, 
there exists an i ∈ [s] such that 

\[ \forall x, \quad \Pr\left[ \mathrm{Dec}_2^{g}(x, a_i) = f(x) \right] > 2/3. \]


Note that we don’t explicitly require a bound on the list size s 
(to avoid introducing another parameter), but certainly it cannot be 
larger than the running time of Dec_1. 

As we did for locally (unique-)decodable codes, we can define a 
local δ-list-correcting algorithm, where Dec_2 should recover arbitrary 
symbols of the codeword f̂ rather than the message f. In this case, 
we don't require that for all j, Dec_2^g(·, a_j) is a codeword, or that it is 
close to g; in other words, some of the a_j's may be junk. Analogously 
to Lemma 7.39, a local δ-list-correcting algorithm implies local 
δ-list-decoding if the code is systematic. 

Proposition 7.36 shows how locally decodable codes convert 
functions that are hard in the worst case to ones that are hard on 
average. The same is true for local list-decoding: 


Proposition 7.55. Let Enc be an error-correcting code with local 
δ-list-decoding algorithm (Dec_1, Dec_2) where Dec_2 runs in time at 
most t_Dec, and let f be worst-case hard for non-uniform time t. Then 
f̂ = Enc(f) is (t′, δ) average-case hard, where t′ = t/t_Dec. 


Proof. Suppose for contradiction that f̂ is not (t′, δ)-hard. Then some 
nonuniform algorithm A running in time t′ computes f̂ with error 
probability smaller than δ. But if Enc has a local δ-list-decoding algorithm, 
then (with A playing the role of g) that means there exists a_i (one 
of the possible outputs of Dec_1^A), such that Dec_2^A(·, a_i) computes f(·) 
everywhere. Hardwiring a_i as advice, Dec_2^A(·, a_i) is a nonuniform 
algorithm running in time at most time(A) · time(Dec_2) ≤ t′ · t_Dec = t. □ 


Note that, in contrast to Proposition 7.36, here we are using 
nonuniformity more crucially, in order to select the right function from 
the list of possible decodings. As we will discuss in Section 7.7.1, this 
use of nonuniformity is essential for “black-box” constructions, that do 
not exploit any structure in the hard function f or the adversary (A in 
the above proof). However, there are results on hardness amplification 
against uniform algorithms, which use structure in the hard function 
f (e.g., that it is complete for a complexity class like E or NP) to 
identify it among the list of decodings without any nonuniform advice. 


7.6.3 Local List-Decoding Reed-Muller Codes 


Theorem 7.56. There is a universal constant c such that the q-ary 
Reed-Muller code of degree d and dimension m can be locally 
(1 − ε)-list-corrected in time poly(q, m) for ε = c√(d/q). 


Note that the distance at which list decoding can be done 
approaches 1 as q/d → ∞. It matches the bound for list-decoding 
Reed-Solomon codes (Theorem 5.19) up to the constant c. Moreover, 
as the dimension m increases, the running time of the decoder 
(poly(q,m)) becomes much smaller than the block length (q^m · log q), 
at the price of a reduced rate ((m+d choose d)/q^m). 


Proof. Suppose we are given an oracle g : F^m → F that is (1 − ε)-close 
to some unknown polynomial p : F^m → F, and that we are given an x ∈ 
F^m. Our goal is to describe two algorithms, Dec_1 and Dec_2, where Dec_2 
is able to compute p(x) using a piece of Dec_1's output (i.e., advice). 

The advice that we will give to Dec_2 is the value of p on a single 
point. Dec_1 can easily generate a (reasonably small) list that contains 
one such point by choosing a random y ∈ F^m, and outputting all pairs 
(y,z), for z ∈ F. More formally: 


Algorithm 7.57 (Reed-Muller Local List-Decoder Dec_1). 
Input: An oracle g : F^m → F and a degree parameter d 


(1) Choose y uniformly at random from F^m 
(2) Output {(y,z) : z ∈ F} 


This first-phase decoder is rather trivial in that it doesn't make use 
of the oracle access to the received word g. It is possible to improve 
both the running time and list size of Dec_1 by using oracle access to g, 
but we won't need those improvements below. 

Now, the task of Dec_2 is to calculate p(x), given the value of p 
on some point y. Dec_2 does this by looking at g restricted to the line 
through x and y, and using the list-decoding algorithm for Reed- 
Solomon Codes to find the univariate polynomials q_1, q_2, …, q_t that are 
close to g on this line. If exactly one of these polynomials q_i agrees with p on the 
test point y, then we can be reasonably confident that q_i(x) = p(x). 

In more detail, the decoder works as follows: 


Algorithm 7.58 (Reed-Muller Local List-Corrector Dec_2). 
Input: An oracle g : F^m → F, an input x ∈ F^m, advice (y,z) ∈ F^m × F, 
and a degree parameter d 


(1) Let ℓ = ℓ_{x,y−x} : F → F^m be the line through x and y (so that 
ℓ(0) = x and ℓ(1) = y). 

(2) Run the (1 − ε/2)-list-decoder for Reed-Solomon Codes 
(Theorem 5.19) on g|_ℓ to get all univariate polys q_1, …, q_t 
that agree with g|_ℓ in greater than an ε/2 fraction of points. 

(3) If there exists a unique i such that q_i(1) = z, output q_i(0). 
Otherwise, fail. 
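
The two phases can be sketched together as follows. The assumptions are significant: the field is a small prime field Z_p, the Reed-Solomon list-decoder of Theorem 5.19 is replaced by an exhaustive search over all p^(d+1) polynomials (so this is only usable for toy parameters), and dec1 takes the dimension m and field size p directly instead of an oracle, since the first phase never queries g. All function names are hypothetical.

```python
import itertools
import random


def eval_poly(coeffs, t, p):
    """Horner evaluation of sum_k coeffs[k] * t^k modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * t + c) % p
    return acc


def rs_list_decode(word, d, p, min_agreement):
    """Brute force: all polynomials of degree <= d that agree with word on
    more than a min_agreement fraction of the p points."""
    return [coeffs for coeffs in itertools.product(range(p), repeat=d + 1)
            if sum(eval_poly(coeffs, t, p) == word[t] for t in range(p))
            > min_agreement * p]


def dec1(m, p, rng=random):
    """Phase 1: pick a random y and output the advice list {(y, z) : z in F}."""
    y = tuple(rng.randrange(p) for _ in range(m))
    return [(y, z) for z in range(p)]


def dec2(g, x, advice, d, p, eps):
    """Phase 2: correct g at x using advice (y, z), where z is a guess for p(y)."""
    y, z = advice
    # Line through x and y, parameterized so that l(0) = x and l(1) = y.
    word = [g(tuple((xi + t * (yi - xi)) % p for xi, yi in zip(x, y)))
            for t in range(p)]
    matches = [q for q in rs_list_decode(word, d, p, eps / 2)
               if eval_poly(q, 1, p) == z]
    return matches[0][0] if len(matches) == 1 else None  # q_i(0), or fail
```

Exactly one entry of the list returned by dec1 carries the correct value z = p(y); the other entries play the role of the "junk" advice strings permitted by the definition of list-correcting.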


Now that we have fully specified the algorithms, it remains to 
analyze them and show that they decode p correctly. Observe that it 
suffices to compute p on greater than an 11/12 fraction of the points 
x, because then we can apply the unique local correcting algorithm of 
Theorem 7.42. Therefore, to finish the proof of the theorem we must 
prove the following. 


Claim 7.59. Suppose that g : F^m → F has agreement greater than ε 
with a polynomial p : F^m → F of degree at most d. For at least half 
of the points y ∈ F^m the following holds for greater than an 11/12 
fraction of lines ℓ going through y: 


(1) agr(g|_ℓ, p|_ℓ) > ε/2. 
(2) There does not exist any univariate polynomial q of degree 
at most d other than p|_ℓ such that agr(g|_ℓ, q) > ε/2 and 
q(y) = p(y). 


Proof of Claim: It suffices to show that Items 1 and 2 hold 
with probability 0.99 over the choice of a random point y ∈ F^m and 
a random line ℓ through y; then we can apply Markov's inequality to 
finish the job. 

Item 1 holds by pairwise independence. If the line ℓ is chosen 
randomly, then the q points on ℓ are pairwise independent samples 
of F^m. The expected agreement between g|_ℓ and p|_ℓ is simply the 
agreement between g and p, which is greater than ε by hypothesis. So 
by the Pairwise-Independent Tail Inequality (Prop. 3.28), 

\[ \Pr_{\ell}\left[ \mathrm{agr}(g|_{\ell}, p|_{\ell}) \leq \varepsilon/2 \right] < \frac{1}{q \cdot (\varepsilon/2)^2}, \]

which can be made smaller than 0.01 for a large enough choice of the 
constant c in ε = c√(d/q). 

To prove Item 2, we imagine first choosing the line ℓ uniformly 
at random from all lines in F^m, and then choosing y uniformly at 
random from the points on ℓ (reparameterizing ℓ so that ℓ(1) = y). 
Once we choose ℓ, we can let q_1, …, q_t be all polynomials of degree 
at most d, other than p|_ℓ, that have agreement greater than ε/2 with 
g|_ℓ. (Note that this list is independent of the parametrization of ℓ, 
i.e., if ℓ′(x) = ℓ(ax + b) for a ≠ 0 then p|_{ℓ′} and q′_i(x) = q_i(ax + b) 
have agreement equal to agr(p|_ℓ, q_i).) By the list-decodability of 
Reed-Solomon Codes (Proposition 5.15), we have t = O(√(q/d)). 

Now, since two distinct polynomials can agree in at most d points, 
when we choose a random point y on ℓ, the probability that q_i and 
p agree at y is at most d/q. After reparameterization of ℓ so that 
ℓ(1) = y, this gives 

\[ \Pr_{y}\left[ \exists i : q_i(1) = p(1) \right] \leq t \cdot \frac{d}{q} = O\left( \sqrt{\frac{d}{q}} \right). \]

This can also be made smaller than 0.01 for large enough choice 
of the constant c (since we may assume q/d > c², else ε ≥ 1 and the 
result holds vacuously). □ 
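
For concreteness, here is how the two failure probabilities depend on c once ε = c√(d/q) is substituted. This is a sketch: the first expression is the tail bound for q pairwise independent samples with deviation ε/2, the symbol C is an assumed name for the hidden constant in t = O(√(q/d)) (taken not to grow with c), and d ≥ 1 is assumed.

\[
\frac{1}{q\,(\varepsilon/2)^2} = \frac{4}{q\,\varepsilon^2} = \frac{4}{c^2 d} \le \frac{4}{c^2},
\qquad
t \cdot \frac{d}{q} \le C \sqrt{\frac{q}{d}} \cdot \frac{d}{q} = C \sqrt{\frac{d}{q}} < \frac{C}{c}
\quad \text{(using } q/d > c^2 \text{)}.
\]

Both bounds fall below 0.01 once c is a sufficiently large constant.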


7.6.4 Putting it Together 


To obtain a locally list-decodable (rather than list-correctible) code, 
we again use the low-degree extension (Lemma 7.48) to obtain a 
systematic encoding. As before, to encode messages of length ℓ = log L, 
we apply Lemma 7.48 with |H| = ⌈√q⌉ and m = ⌈ℓ/log|H|⌉, for total 
degree d ≤ √q · ℓ. To decode from a 1 − ε fraction of errors using 
Theorem 7.56, we need c√(d/q) ≤ ε, which follows if q ≥ c⁴ℓ²/ε⁴. This 
yields the following locally list-decodable codes: 


Theorem 7.60. For every L ∈ N and ε > 0, there is an explicit code 
Enc : {0,1}^L → Σ^{L̂}, with blocklength L̂ = poly(L, 1/ε) and alphabet 
size |Σ| = poly(log L, 1/ε), that has a local (1 − ε)-list-decoder running 
in time poly(log L, 1/ε). 
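
The field-size requirement used to obtain Theorem 7.60 can be checked by a one-line calculation. The sketch below uses only the degree bound d ≤ ℓ√q from the low-degree extension:

\[
c \sqrt{\frac{d}{q}} \le c \sqrt{\frac{\ell \sqrt{q}}{q}} = \frac{c\,\ell^{1/2}}{q^{1/4}} \le \varepsilon
\quad\Longleftrightarrow\quad
q \ge \frac{c^4 \ell^2}{\varepsilon^4}.
\]

Assuming q is taken within a constant factor of this bound and |H| = ⌈√q⌉ ≥ 2, the input length satisfies ℓ̂ = m log q ≤ 2ℓ + log q = O(ℓ + log(1/ε)), which is consistent with the blocklength L̂ = 2^{ℓ̂} = poly(L, 1/ε) stated in the theorem.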


Concatenating the code with a Hadamard code, similarly to 
Problem 5.2, we obtain: 


Theorem 7.61. For every L ∈ N and ε > 0, there is an explicit code 
Enc : {0,1}^L → {0,1}^{L̂} with blocklength L̂ = poly(L, 1/ε) that has a 
local (1/2 − ε)-list-decoder running in time poly(log L, 1/ε). 


Using Proposition 7.55, we get the following hardness amplification 
result: 


Theorem 7.62. For s : N → N, suppose that there is a function 
f : {0,1}^ℓ → {0,1} in E that is worst-case hard against nonuniform 
time s(ℓ), where s(ℓ) is computable in time 2^{O(ℓ)}. Then there exists 
f̂ : {0,1}^{O(ℓ)} → {0,1} in E that is (1/2 − 1/s′(ℓ)) average-case hard 
against (nonuniform) time s′(ℓ) for s′(ℓ) = s(ℓ)^{Ω(1)}/poly(ℓ). 


Combining this with Theorem 7.19 and Corollary 7.20, we get: 


Theorem 7.63. For s : N → N, suppose that there is a function f ∈ E 
such that for every input length ℓ ∈ N, f is worst-case hard for 
nonuniform time s(ℓ), where s(ℓ) is computable in time 2^{O(ℓ)}. Then for every 
m ∈ N, there is a mildly explicit (m, 1/m) pseudorandom generator G : 
{0,1}^{d(m)} → {0,1}^m with seed length d(m) = O(s^{-1}(poly(m))²/log m). 


Corollary 7.64. For s : N → N, suppose that there is a function 
f ∈ E = DTIME(2^{O(ℓ)}) such that for every input length ℓ ∈ N, f is 
worst-case hard for nonuniform time s(ℓ). Then: 


(1) If s(ℓ) = 2^{Ω(ℓ)}, then BPP = P. 
(2) If s(ℓ) = 2^{ℓ^{Ω(1)}}, then BPP ⊂ P̃. 
(3) If s(ℓ) = ℓ^{ω(1)}, then BPP ⊂ SUBEXP. 
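
As a sanity check, one can plug each hardness bound into the seed length d(m) = O(s^{-1}(poly(m))²/log m) of Theorem 7.63 and look at the cost 2^{d(m)} of enumerating all seeds. The following is a sketch of that arithmetic, not a full proof:

\[
\begin{aligned}
s(\ell) = 2^{\Omega(\ell)} &\;\Longrightarrow\; s^{-1}(\mathrm{poly}(m)) = O(\log m) &&\Longrightarrow\; d(m) = O(\log m), && 2^{d(m)} = \mathrm{poly}(m); \\
s(\ell) = 2^{\ell^{\Omega(1)}} &\;\Longrightarrow\; s^{-1}(\mathrm{poly}(m)) = \mathrm{polylog}(m) &&\Longrightarrow\; d(m) = \mathrm{polylog}(m), && 2^{d(m)} = 2^{\mathrm{polylog}(m)}; \\
s(\ell) = \ell^{\omega(1)} &\;\Longrightarrow\; s^{-1}(\mathrm{poly}(m)) = m^{o(1)} &&\Longrightarrow\; d(m) = m^{o(1)}, && 2^{d(m)} = 2^{m^{o(1)}}.
\end{aligned}
\]

These three enumeration costs are polynomial, quasipolynomial, and subexponential in m, which matches the three conclusions of the corollary.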


We note that the hypotheses in these results are simply asserting 
that there are problems in E of high circuit complexity, which is quite 
plausible. Indeed, many common NP-complete problems, such as 
SAT, are in E and are commonly believed to have circuit complexity 
2^{Ω(ℓ)} on inputs of length ℓ (though we seem very far from proving it). 
Thus, we have a “win-win” situation: either we can derandomize all 
of BPP or SAT has significantly faster (nonuniform) algorithms than 
currently known. 

Problem 7.1 establishes a converse to Theorem 7.63, showing that 
pseudorandom generators imply circuit lower bounds. The equivalence 
is fairly tight, except for the fact that Theorem 7.63 has seed length 
d(m) = O(s^{-1}(poly(m))²/log m) instead of d(m) = O(s^{-1}(poly(m))). 
It is known how to close this gap via a different construction, which 
is more algebraic and constructs PRGs directly from worst-case hard 
functions (see the Chapter Notes and References); a (positive) solution 
to Open Problem 7.25 would give a more modular and versatile 
construction. For Corollary 7.64, however, there is only a partial 
converse known. See Section 8.2.2. 


Technical Comment. Consider Item 3 of Corollary 7.64, which 
assumes that there is a problem in E of superpolynomial circuit 
complexity. This sounds similar to assuming that E ⊄ P/poly (which 
is equivalent to EXP ⊄ P/poly, by Problem 7.2). However, the latter 
assumption is a bit weaker, because it only guarantees that there is a 
function f ∈ E and a function s(ℓ) = ℓ^{ω(1)} such that f has complexity 
at least s(ℓ) for infinitely many input lengths ℓ. Theorem 7.63 and 
Corollary 7.64 assume that f has complexity at least s(ℓ) for all ℓ; 
equivalently f is not in i.o.-P/poly, the class of functions that are 
computable by poly-sized circuits for infinitely many input lengths. We 
need the stronger assumptions because we want to build generators 
G : {0,1}^{d(m)} → {0,1}^m that are pseudorandom for all output lengths 
m, in order to get derandomizations of BPP algorithms that are 
correct on all input lengths. However, there are alternate forms of 
these results, where the “infinitely often” is moved from the hypothesis 
to the conclusion. For example, if E ⊄ P/poly, we can conclude 
that BPP ⊂ i.o.-SUBEXP, where i.o.-SUBEXP denotes the class 
of languages having deterministic subexponential-time algorithms 
that are correct for infinitely many input lengths. Even though these 
“infinitely often” issues need to be treated with care for the sake of 
precision, it would be quite unexpected if the complexity of problems 
in E and BPP oscillated as a function of input length in such a 
strange way that they made a real difference. 


7.7 Connections to Other Pseudorandom Objects 


7.7.1 Black-Box Constructions 


Similarly to our discussion after Theorem 7.19, the pseudorandom 
generator construction in the previous section is very general. The 
construction shows how to take any function f : {0,1}^ℓ → {0,1} and use it 
as a subroutine (oracle) to compute a generator G^f : {0,1}^d → {0,1}^m 
whose pseudorandomness can be related to the hardness of f. The 
only place that we use the fact that f ∈ E is to deduce that G^f is 
computable in E. The reduction proving that G^f is pseudorandom is 
also very general. We showed how to take any T that distinguishes 
the output of G^f(U_d) from U_m and use it as a subroutine (oracle) 
to build an efficient nonuniform algorithm Red such that Red^T 
computes f. The only place that we use the fact that T is itself an 
efficient nonuniform algorithm is to deduce that Red^T is an efficient 
nonuniform algorithm, contradicting the worst-case hardness of f. 

Such constructions are called “black box,” because they treat the 
hard function f and the distinguisher T as black boxes (i.e., oracles), 
without using the code of the programs that compute f and T. 
As we will see, black-box constructions have significant additional 
implications. Thus, we formalize the notion of a black-box construction 
as follows: 


Definition 7.65. Let G^f : [D] → [M] be a deterministic algorithm 
that is defined for every oracle f : [L] → {0,1}, let t, k be positive 
integers such that k ≤ t, and let ε > 0. We say that G is a (t,k,ε) 
black-box PRG construction if there is a randomized oracle algorithm 
Red, running in time t, such that for every f : [L] → {0,1} and 
T : [M] → {0,1}, if 

\[ \Pr\left[ T(G^f(U_{[D]})) = 1 \right] - \Pr\left[ T(U_{[M]}) = 1 \right] > \varepsilon, \]

then there is an advice string z ∈ [K] such that 

\[ \forall x \in [L] \quad \Pr\left[ \mathrm{Red}^T(x,z) = f(x) \right] > 2/3, \]

where the probability is taken over the coin tosses of Red. 


Note that we have separated the running time t of Red and the 
length k of its nonuniform advice into two separate parameters, and 
assume k ≤ t since an algorithm cannot read more than t bits in 
time t. When we think of Red as a nonuniform algorithm (like a 
boolean circuit), then we may as well think of these two parameters as 
being equal. (Recall that, up to polylog(s) factors, being computable 
by a circuit of size s is equivalent to being computable by a uniform 
algorithm running in time s with s bits of nonuniform advice.) 
However, separating the two parameters is useful in order to isolate 
the role of nonuniformity, and to establish connections with the other 
pseudorandom objects we are studying.⁷ 

We note that if we apply a black-box pseudorandom generator 
construction with a function f that is actually hard to compute, then 
the result is indeed a pseudorandom generator: 


Proposition 7.66. If G is a (t,k,ε) black-box PRG construction 
and f has nonuniform worst-case hardness at least s, then G^f is an 
(s/O(t), ε) pseudorandom generator. 


⁷ Sometimes it is useful to allow the advice string z to also depend on the coin tosses of 
the reduction Red. By error reduction via r = O(ℓ) repetitions, such a reduction can be 
converted into one satisfying Definition 7.65 by sampling r = O(ℓ) sequences of coin tosses, 
but this blows up the advice length by a factor of r, which may be too expensive. 


Now, we can rephrase the pseudorandom generator construction of 
Theorem 7.63 as follows: 


Theorem 7.67. For every constant γ > 0, and every ℓ, m ∈ N, 
and every ε > 0, there is a (t,k,ε) black-box PRG construction 
G^f : {0,1}^d → {0,1}^m that is defined for every oracle f : {0,1}^ℓ → {0,1}, 
with the following properties: 


(1) (Mild) explicitness: G^f is computable in uniform time 
poly(m, 2^ℓ) given an oracle for f. 

(2) Seed length: d = O((ℓ + log(1/ε))²/log m). 

(3) Reduction running time: t = poly(m, 1/ε). 

(4) Reduction advice length: k = m^{1+γ} + O(ℓ + log(m/ε)). 


In addition to asserting the black-box nature of Theorem 7.63, the 
above is more general in that it allows ε to vary independently of m 
(rather than setting ε = 1/m), and gives a tighter bound on the length 
of the nonuniform advice than just t = poly(m, 1/ε). 


Proof Sketch: Given a function f, G^f encodes f in the locally
list-decodable code of Theorem 7.61 (with decoding distance 1/2 − ε′
for ε′ = ε/m) to obtain f̂ : {0,1}^ℓ̂ → {0,1} with ℓ̂ = O(ℓ + log(1/ε)),
and then computes the Nisan–Wigderson generator based on f̂
(Construction 7.23) and a (ℓ̂,γ log m) design. The seed length and
mild explicitness follow from the explicitness and parameters of the
design and code (Lemma 7.22 and Theorem 7.61). The running time
and advice length of the reduction follow from inspecting the proofs
of Theorems 7.61 and 7.19. Specifically, the running time of the
Nisan–Wigderson reduction in the proof of Theorem 7.19 is poly(m)
(given the nonuniform advice) by inspection, and the running time of
the local list-decoding algorithm is poly(ℓ,1/ε′) ≤ poly(m,1/ε). (We
may assume that m ≥ ℓ, otherwise G^f need not have any stretch, and
the conclusion is trivial.) The advice from the locally
list-decodable code consists of a pair (y,z) ∈ F^v × F, where F is a
field of size q = poly(ℓ,1/ε) and v log|F| = ℓ̂ = O(ℓ + log(1/ε)). The
Nisan–Wigderson reduction begins with the distinguisher-to-predictor
reduction of Proposition 7.16, which uses log m bits of advice to
specify the index i at which the predictor works and m − i − 1 bits
for hardwiring the bits fed to the distinguisher in positions i,...,m. In
addition, for j = 1,...,i − 1, the Nisan–Wigderson reduction nonuniformly
hardwires a truth table for the function f_j(y), which depends on
the at most γ · log m bits of y selected by the intersection of the ith and
jth sets in the design. These truth tables require at most (i − 1) · m^γ
bits of advice. In total, the amount of advice used is at most

\[
O(\ell + \log(1/\varepsilon)) + m - i - 1 + (i-1)\cdot m^{\gamma}
\;\le\; m^{1+\gamma} + O(\ell + \log(1/\varepsilon)). \qquad \square
\]
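The advice accounting in the proof sketch can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `nw_advice_bits` and the parameter `code_advice_bits` (standing in for the O(ℓ + log(1/ε)) bits used by the list-decodable code) are hypothetical, and each truth table is assumed to have exactly 2^⌊γ log m⌋ ≤ m^γ entries.

```python
import math

def nw_advice_bits(m: int, gamma: float, i: int, code_advice_bits: int) -> int:
    """Advice bits used when the predictor works at index i (1 <= i <= m).

    Assumption (illustrative): set intersections in the design have size at most
    floor(gamma * log2(m)), so each hardwired truth table has at most m**gamma entries.
    """
    index_bits = math.ceil(math.log2(m))           # names the index i
    hardwired_bits = max(m - i - 1, 0)             # fixed inputs fed to the distinguisher
    intersection = math.floor(gamma * math.log2(m))
    table_bits = (i - 1) * (2 ** intersection)     # one truth table for each j < i
    return code_advice_bits + index_bits + hardwired_bits + table_bits

if __name__ == "__main__":
    m, gamma, code_bits = 64, 0.5, 100
    worst = max(nw_advice_bits(m, gamma, i, code_bits) for i in range(1, m + 1))
    bound = m ** (1 + gamma) + code_bits + math.ceil(math.log2(m))
    print(worst, "<=", bound)
```

The dominant term is the (i − 1) truth tables, which is why the advice length is m^{1+γ} plus lower-order terms rather than poly(m).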


One advantage of a black-box construction is that it allows us to 
automatically “scale up” the pseudorandom generator construction. 
If we apply the construction to a function f that is not necessarily 
computable in E, but in some higher complexity class, we get a 
pseudorandom generator Gf computable in an analogously higher 
complexity class. Similarly, if we want our pseudorandom generator 
to fool tests T computable by nonuniform algorithms in some higher 
complexity class, it suffices to use a function f that is hard against an 
analogously higher class. 

For example, we get the following “nondeterministic” analogue of 
Theorem 7.63: 


Theorem 7.68. For s : N → N, suppose that there is a function f ∈
NE ∩ co-NE such that for every input length ℓ ∈ N, f is worst-case
hard for nonuniform algorithms running in time s(ℓ) with an NP
oracle (equivalently, boolean circuits with SAT gates), where s(ℓ) is
computable in time 2^{O(ℓ)}. Then for every m ∈ N, there is a
pseudorandom generator G : {0,1}^{d(m)} → {0,1}^m with seed length d(m) =
O(s^{−1}(poly(m))^2/log m) such that G is (m,1/m)-pseudorandom
against nonuniform algorithms with an NP oracle, and G is computable
in nondeterministic time 2^{O(d(m))} (meaning that there is a
nondeterministic algorithm that on input x, outputs G(x) on at least one
computation path and outputs either G(x) or “fail” on all computation paths).


The significance of such generators is that they can be used for
derandomizing AM, which is a randomized analogue of NP, defined 
as follows: 


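Definition 7.69. A language L is in AM iff there is a probabilistic
polynomial-time verifier V and polynomials m(n), p(n) such that for
all inputs x of length n,

\[
x \in L \;\Rightarrow\; \Pr_{r \leftarrow \{0,1\}^{m(n)}}\left[\exists y \in \{0,1\}^{p(n)}\ V(x,r,y) = 1\right] \ge 2/3,
\]
\[
x \notin L \;\Rightarrow\; \Pr_{r \leftarrow \{0,1\}^{m(n)}}\left[\exists y \in \{0,1\}^{p(n)}\ V(x,r,y) = 1\right] \le 1/3.
\]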


Another (non-obviously!) equivalent definition of AM is the 
class of languages having constant-round interactive proof systems, 
where a computationally unbounded prover (“Merlin”) can convince 
a probabilistic polynomial-time verifier (“Arthur”) that an input 
x is in L through an interactive protocol with O(1) rounds of
polynomial-length communication. 

Graph Nonisomorphism is the most famous example of a language 
that is in AM but is not known to be in NP. Nevertheless, using The- 
orem 7.68 we can give evidence that Graph Nonisomorphism is in NP. 


Corollary 7.70. If there is a function f ∈ NE ∩ co-NE that, on
inputs of length ℓ, is worst-case hard for nonuniform algorithms
running in time 2^{Ω(ℓ)} with an NP oracle, then AM = NP.


While the above complexity assumption may seem very strong, 
it is actually known to be weaker than the very natural assumption
that exponential time E = DTIME(2^{O(n)}) is not contained in
subexponential space ∩_{ε>0} DSPACE(2^{εn}).

As we saw in Section 7.4.3 on derandomizing constant-depth 
circuits, black-box constructions can also be “scaled down” to apply 
to lower complexity classes, provided that the construction G and/or 
reduction Red can be shown to be computable in a lower class 
(e.g., AC^0).


7.7.2 Connections to Other Pseudorandom Objects


At first, it may seem that pseudorandom generators are of a different 
character than the other pseudorandom objects we have been studying. 
We require complexity assumptions to construct pseudorandom gen- 
erators, and reason about them using the language of computational 
complexity (referring to efficient algorithms, reductions, etc.). The 
other objects we have been studying are all information-theoretic in 
nature, and our constructions of them have been unconditional. 

The notion of black-box constructions will enable us to bridge 
this gap. Note that Theorem 7.67 is unconditional, and we will see 
that it, like all black-box constructions, has an information-theoretic 
interpretation. Indeed, we can fit black-box pseudorandom generator 
constructions into the list-decoding framework of Section 5.3 as follows: 


Construction 7.71. Let G^f : [D] → [M] be an algorithm that is
defined for every oracle f : [n] → {0,1}. Then, setting N = 2^n, define
Γ : [N] × [D] → [M], by

\[
\Gamma(f,y) = G^f(y),
\]

where we view the truth table of f as an element of [N] ≡ {0,1}^n.


It turns out that if we allow the reduction unbounded running time 
(but still bound the advice length), then pseudorandom generator 
constructions have an exact characterization in our framework: 


Proposition 7.72. Let G^f and Γ be as in Construction 7.71. Then
G^f is an (∞,k,ε) black-box PRG construction iff for every T ⊆ [M],
we have

\[
|\mathrm{LIST}_\Gamma(T, \mu(T) + \varepsilon)| \le K,
\]

where K = 2^k.


Proof.


⇒. Suppose G^f is an (∞,k,ε) black-box PRG construction. Then f
is in LIST_Γ(T,μ(T) + ε) iff T distinguishes G^f(U_[D]) from U_[M] with
advantage greater than ε. This implies that there exists a z ∈ [K]
such that Red^T(·,z) computes f everywhere. Thus, the number of
functions f in LIST_Γ(T,μ(T) + ε) is bounded by the number of advice
strings z, which is at most K.

⇐. Suppose that for every T ⊆ [M], we have L = |LIST_Γ(T,μ(T) +
ε)| ≤ K. Then we can define Red^T(x,z) = f_z(x), where f_1,...,f_L are
any fixed enumeration of the elements of LIST_Γ(T,μ(T) + ε).
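To make the quantity in Proposition 7.72 concrete, here is a minimal brute-force sketch that enumerates LIST_Γ(T, μ(T) + ε) for a tiny example. The map `toy_generator` with G^f(y) = (y, f(y)) is a hypothetical stand-in chosen only so the enumeration is easy to follow; it is not a construction from the text.

```python
from fractions import Fraction
from itertools import product

def toy_generator(f, y):
    """Toy G^f(y) = (y, f(y)); f is a truth table (tuple of bits) indexed by the seed y."""
    return (y, f[y])

def list_of_test(T, n, output_space_size, eps):
    """Brute-force LIST_Gamma(T, mu(T) + eps) over all 2^n truth tables f.

    Here the seed space is [D] = {0, ..., n-1} and the output space [M] is the
    set of pairs (y, bit), so output_space_size = 2 * n.
    """
    mu_T = Fraction(len(T), output_space_size)
    members = []
    for f in product((0, 1), repeat=n):
        hits = sum(1 for y in range(n) if toy_generator(f, y) in T)
        if Fraction(hits, n) > mu_T + eps:   # T accepts G^f(U) noticeably too often
            members.append(f)
    return members

if __name__ == "__main__":
    f0 = (0, 1, 1, 0)
    T = {(y, f0[y]) for y in range(4)}       # the graph of f0, density 1/2
    print(list_of_test(T, 4, 8, Fraction(1, 4)))
    print(len(list_of_test(T, 4, 8, Fraction(1, 8))))
```

Each member of the list is a function that the test T "distinguishes"; the proposition says that k bits of advice suffice exactly when every such list has at most 2^k members.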


Notice that this characterization of black-box PRG constructions 
(with reductions of unbounded running time) is the same as the one 
for averaging samplers (Proposition 5.30) and randomness extractors 
(Proposition 6.23). In particular, the black-box PRG construction 
of Theorem 7.67 is already a sampler and extractor of very good 
parameters: 


Theorem 7.73. For every constant γ > 0, every n ∈ N, k ∈ [0,n], and
every ε > 0, there is an explicit (k,ε) extractor Ext : {0,1}^n × {0,1}^d →
{0,1}^m with seed length d = O(log^2(n/ε)/log k) and output length
m ≥ k^{1−γ}.


Proof Sketch: Without loss of generality, assume that n is a power
of 2, namely n = 2^ℓ = L. Let G^f(y) : {0,1}^d → {0,1}^m be the (t,k_0,ε_0)
black-box PRG construction of Theorem 7.67 which takes a function
f : {0,1}^ℓ → {0,1} and has k_0 = m^{1+γ} + O(ℓ + log(m/ε_0)), and let
Ext(f,y) = Γ(f,y) = G^f(y). By Propositions 7.72 and 6.23, Ext is a
(k_0 + log(1/ε_0),2ε_0) extractor. Setting ε = 2ε_0 and k = k_0 + log(1/ε_0),
we have a (k,ε) extractor with output length

\[
m = \bigl(k - O(\ell + \log(m/\varepsilon))\bigr)^{1-\gamma} \;\ge\; k^{1-\gamma} - O(\ell + \log(m/\varepsilon)).
\]

We can increase the output length to k^{1−γ} by increasing the seed
length by O(ℓ + log(m/ε)). The total seed length then is

\[
d = O\!\left(\frac{(\ell + \log(1/\varepsilon))^2}{\log m} + \ell + \log(m/\varepsilon)\right) = O\!\left(\frac{\log^2(n/\varepsilon)}{\log k}\right). \qquad \square
\]
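The final equality for the seed length can be unpacked as follows. This is a sketch under stated assumptions that are not spelled out in the original: ℓ = log n, m ≤ n, 2 ≤ k ≤ n, ε ≤ 1, and log m = Ω(log k) (which holds once m ≥ k^{1−γ}).

```latex
\begin{align*}
\frac{(\ell + \log(1/\varepsilon))^2}{\log m}
  &= \frac{\log^2(n/\varepsilon)}{\log m}
   = O\!\left(\frac{\log^2(n/\varepsilon)}{\log k}\right)
  && \text{since } \ell = \log n \text{ and } \log m = \Omega(\log k), \\
\ell + \log(m/\varepsilon)
  &\le 2\log(n/\varepsilon)
   \le \frac{2\log^2(n/\varepsilon)}{\log k}
  && \text{since } m \le n \text{ and } \log k \le \log n \le \log(n/\varepsilon).
\end{align*}
```

Both terms are therefore O(log^2(n/ε)/log k), which is the bound claimed in Theorem 7.73.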


The parameters of Theorem 7.73 are not quite as good as those of
Theorem 6.36 and Corollary 6.39, as the output length is k^{1−γ} rather
than (1 − γ)k, and the seed length is only O(log n) when k = n^{Ω(1)}.
However, these settings of parameters are already sufficient for many 
purposes, such as the simulation of BPP with weak random sources. 
Moreover, the extractor construction is much more direct than that of 
Theorem 6.36. Specifically, it is 


\[
\mathrm{Ext}(f,y) = \bigl(\hat{f}(y|_{S_1}), \ldots, \hat{f}(y|_{S_m})\bigr),
\]

where f̂ is an encoding of f in a locally list-decodable code and
S_1,...,S_m are a design. In fact, since Proposition 7.72 does not depend
on the running time of the list-decoding algorithm, but only the
amount of nonuniformity, we can use any (1/2 − ε/2m,poly(m/ε))
list-decodable code, which will only require an advice of length
O(log(m/ε)) to index into the list of decodings. In particular, we can
use a Reed-Solomon code concatenated with a Hadamard code, as in 
Problem 5.2. 
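The shape of this extractor can be sketched in a few lines. The code below is an illustration under assumptions, not the construction of the theorem: `polynomial_design` builds a standard polynomial-based design over Z_q for a prime q (the names and parameters are hypothetical), and the caller supplies `f_hat` as an arbitrary bit-valued function, where a real instantiation would use the encoding of the source in a list-decodable code.

```python
from itertools import product

def polynomial_design(q: int, degree_bound: int, m: int):
    """Return m subsets of a universe of size q*q, each of size q.

    Set S_p = {(a, p(a))} for a polynomial p of degree < degree_bound over Z_q
    (q prime). Two distinct polynomials agree on fewer than degree_bound points,
    so pairwise intersections have size < degree_bound.
    """
    if m > q ** degree_bound:
        raise ValueError("not enough polynomials for m sets")
    sets = []
    for coeffs in product(range(q), repeat=degree_bound):
        if len(sets) == m:
            break
        positions = []
        for a in range(q):
            value = sum(c * pow(a, e, q) for e, c in enumerate(coeffs)) % q
            positions.append(a * q + value)   # encode the pair (a, p(a)) as one index
        sets.append(positions)
    return sets

def nw_extract(f_hat, seed_bits, design):
    """Ext(f, y) = (f_hat(y|S_1), ..., f_hat(y|S_m)) for a seed y given as a list of bits."""
    return [f_hat(tuple(seed_bits[pos] for pos in S)) for S in design]

if __name__ == "__main__":
    design = polynomial_design(q=5, degree_bound=2, m=10)
    parity = lambda bits: sum(bits) % 2       # toy stand-in for f_hat
    print(nw_extract(parity, [1, 0] * 12 + [1], design))
```

The seed has q^2 bits and each output bit reads only the q seed positions in one set, so any two output bits share fewer than `degree_bound` seed positions.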

We now provide some additional intuition for why black-box 
pseudorandom generator constructions are also extractors. A black- 
box PRG construction Gf is designed to use a computationally 
hard function f (plus a random seed) to produce an output that is 
computationally indistinguishable from uniform. When we view it as 
an extractor Ext(f,y) = G^f(y), we instead are feeding it a function
f that is chosen randomly from a high min-entropy distribution (plus 
a random seed). This can be viewed as saying that f is information- 
theoretically hard, and from this stronger hypothesis, we are able 
to obtain the stronger conclusion that the output is statistically 
indistinguishable from uniform. The information-theoretic hardness 
of f can be formalized as follows: if f is sampled from a source F
of min-entropy at least k + log(1/ε), then for every fixed function A
(such as A = Red^T), the probability (over f ← F) that there exists
a string z of length k such that A(·,z) computes f everywhere is at
most ε. That is, a function generated with min-entropy larger than
k is unlikely to have a description of length k (relative to any fixed 
“interpreter” A). 
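The bound in this formalization is a union bound over the 2^k advice strings. As a short worked derivation (using only the definition of min-entropy, under which every fixed function has probability at most 2^{−(k+log(1/ε))} under F):

```latex
\[
\Pr_{f \leftarrow F}\bigl[\exists z \in \{0,1\}^k : A(\cdot,z) = f\bigr]
\;\le\; \sum_{z \in \{0,1\}^k} \Pr_{f \leftarrow F}\bigl[f = A(\cdot,z)\bigr]
\;\le\; 2^k \cdot 2^{-(k + \log(1/\varepsilon))}
\;=\; \varepsilon .
\]
```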

Similarly to black-box PRG constructions, we can also discuss
converting worst-case hard functions to average-case hard functions in
a black-box manner:


Definition 7.74. Let Amp^f : [D] → [q] be a deterministic algorithm
that is defined for every oracle f : [n] → {0,1}. We say that Amp is a
(t,k,ε) black-box worst-case-to-average-case hardness amplifier if there
is a probabilistic oracle algorithm Red, called the reduction, running
in time t such that for every function g : [D] → [q] such that

\[
\Pr[g(U_{[D]}) = \mathrm{Amp}^f(U_{[D]})] > 1/q + \varepsilon,
\]

there is an advice string z ∈ [K], where K = 2^k, such that

\[
\forall x \in [n] \quad \Pr[\mathrm{Red}^g(x,z) = f(x)] \ge 2/3,
\]

where the probability is taken over the coin tosses of Red.


Note that this definition is almost identical to that of a locally
(1 − 1/q − ε)-list-decodable code (Definition 7.54), viewing Amp^f as
Enc(f), and Red^g as Dec_2^g. The only difference is that in the definition
of locally list-decodable code, we require that there is a first-phase
decoder Dec_1^g that efficiently produces a list of candidate advice strings
(a property that is natural from a coding perspective, but is not
needed when amplifying hardness against nonuniform algorithms). If
we remove the constraint on the running time of Red, we simply obtain
the notion of a (1 − 1/q − ε,K) list-decodable code. By analogy, we
can view black-box PRG constructions (with reductions of bounded 
running time) as simply being extractors (or averaging samplers) with 
a kind of efficient local list-decoding algorithm (given by Red, again 
with an advice string that need not be easy to generate). 
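The list-decoding view with unbounded reduction time can be illustrated by brute force. The sketch below is an assumption-laden toy, not a construction from the text: it takes the Hadamard code as the amplifier (so q = 2 and D = 2^n) and enumerates every f whose encoding agrees with a received word g on more than a 1/2 + ε fraction of positions; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def hadamard_encode(f_bits):
    """Amp^f(y) = <f, y> mod 2 for every y in {0,1}^n, with y given as an integer."""
    n = len(f_bits)
    return [sum(fb & ((y >> j) & 1) for j, fb in enumerate(f_bits)) % 2
            for y in range(2 ** n)]

def brute_force_list_decode(g, n, eps):
    """All f in {0,1}^n with Pr_y[g(y) = Amp^f(y)] > 1/2 + eps (exponential time)."""
    candidates = []
    for f_bits in product((0, 1), repeat=n):
        codeword = hadamard_encode(f_bits)
        agreement = Fraction(sum(a == b for a, b in zip(codeword, g)), len(g))
        if agreement > Fraction(1, 2) + eps:
            candidates.append(f_bits)
    return candidates

if __name__ == "__main__":
    f0 = (1, 0, 1)
    g = hadamard_encode(f0)
    g[5] ^= 1                                  # corrupt one of the 8 positions
    print(brute_force_list_decode(g, 3, Fraction(1, 4)))
```

The advice string z of Definition 7.74 then only needs to name one element of this list, so log of the list size bounds the advice length k.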

In addition to their positive uses illustrated above, black-box reduc- 
tions and their information-theoretic interpretations are also useful for 
understanding the limitations of certain proof techniques. For example, 
we see that a black-box PRG construction G^f : {0,1}^d → {0,1}^m must
have a reduction that uses k ≥ m − d − log(1/ε) − 1 bits of advice.
Otherwise, by Propositions 7.72 and 6.23, we would obtain a (k + log(1/ε),2ε)
extractor that outputs m almost-uniform bits when given a source
of min-entropy less than m − d − 1, which is impossible if ε < 1/4.
Indeed, notions of black-box reduction have been used in other settings
as well, most notably to produce a very fine understanding of the 
relations between different cryptographic primitives, meaning which 
ones can be constructed from each other via black-box constructions. 


7.8 Exercises 


Problem 7.1 (PRGs imply hard functions). Suppose that
for every m, there exists a mildly explicit (m,1/m) pseudorandom
generator G_m : {0,1}^{d(m)} → {0,1}^m. Show that E has
a function f : {0,1}^ℓ → {0,1} with nonuniform worst-case hardness
t(ℓ) = Ω(d^{−1}(ℓ − 1)). In particular, if d(m) = O(log m), then
t(ℓ) = 2^{Ω(ℓ)}. (Hint: look at a prefix of G’s output.)


Problem 7.2 (Equivalence of lower bounds for EXP and
E). Show that E contains a function f : {0,1}^ℓ → {0,1} of circuit
complexity ℓ^{ω(1)} if and only if EXP does. (Hint: consider
f′(x_1 ··· x_ℓ) = f(x_1 ··· x_{ℓ^ε}).)

Does the same argument work if we replace ℓ^{ω(1)} with 2^{ℓ^{Ω(1)}}? How
about 2^{Ω(ℓ)}?


Problem 7.3 (Limitations of Cryptographic Generators). 


(1) Prove that a cryptographic pseudorandom generator cannot 
have seed length d(m) = O(logm). 

(2) Prove that cryptographic pseudorandom generators (even
with seed length d(m) = m − 1) imply NP ⊄ P/poly.

(3) Note where your proofs fail if we only require that G is an
(m^c,1/m^c) pseudorandom generator for a fixed constant c.


Problem 7.4 (Deterministic Approximate Counting). Using
the PRG for constant-depth circuits of Theorem 7.29, give deterministic
quasipolynomial-time algorithms for the problems below.
(The running time of your algorithms should be 2^{poly(log n,log(1/ε))},
where n is the size of the circuit/formula given and ε is the accuracy
parameter mentioned.)


(1) Given a constant-depth circuit C and ε > 0, approximate
the fraction of inputs x such that C(x) = 1 to within an
additive error of ε.

(2) Given a DNF formula φ and ε > 0, approximate the number
of assignments x such that φ(x) = 1 to within a multiplicative
fraction of (1 + ε). You may restrict your attention to
φ in which all clauses contain the same number of literals.
(Hint: Study the randomized DNF counting algorithm of
Theorem 2.34.)


Note that these are not decision problems, whereas classes such as 
BPP and BPAC^0 are classes of decision problems. One of the points
of this problem is to show how derandomization can be used for other 
types of problems. 


Problem 7.5 (Strong Pseudorandom Generators). By analogy
with strong extractors, call a function G : {0,1}^d → {0,1}^m a (t,ε)
strong pseudorandom generator iff the function G′(x) = (x,G(x)) is a
(t,ε) pseudorandom generator.


(1) Show that there do not exist strong cryptographic
pseudorandom generators.

(2) Show that the Nisan–Wigderson generator (Theorem 7.24)
is a strong pseudorandom generator.

(3) Suppose that for all constants α > 0, there is a strong
and fully explicit (m,ε(m)) pseudorandom generator
G : {0,1}^{m^α} → {0,1}^m. Show that for every language
L ∈ BPP, there is a deterministic polynomial-time algorithm
A such that for all n, Pr_{x ← {0,1}^n}[A(x) ≠ χ_L(x)] ≤
1/2^n + ε(poly(n)). That is, we get a polynomial-time
average-case derandomization even though the seed length
of G is d(m) = m^α.

(4) Show that for every language L ∈ BPAC^0, there is an AC^0
algorithm A such that Pr_{x ← {0,1}^n}[A(x) ≠ χ_L(x)] ≤ 1/n.
(Warning: be careful about error reduction.)


Problem 7.6 (Private Information Retrieval). The goal of pri- 
vate information retrieval is for a user to be able to retrieve an entry of 
a remote database in such a way that the server holding the database 
learns nothing about which database entry was requested. A trivial 
solution is for the server to send the user the entire database, in which 
case the user does not need to reveal anything about the entry desired. 
We are interested in solutions that involve much less communication. 
One way to achieve this is through replication. Formally, in a q-server 
private information-retrieval (PIR) scheme, an arbitrary database
D ∈ {0,1}^n is duplicated at q noncommunicating servers. On input an
index i ∈ [n], the user algorithm U tosses some coins r and outputs
queries (x_1,...,x_q) = U(i,r), and sends x_j to the jth server. The jth
server algorithm S_j returns an answer y_j = S_j(x_j,D). The user then
computes its output U(i,r,y_1,...,y_q), which should equal D_i, the ith
bit of the database. For privacy, we require that the distribution of
each query x_j (over the choice of the random coin tosses r) is the same
regardless of the index i being queried.
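To illustrate the syntax and the privacy requirement, here is a minimal sketch of a folklore 2-server scheme based on XORs of random subsets. It is not taken from the text and does not achieve low communication (each query is n bits long); the function names are hypothetical and indices are 0-based.

```python
import secrets

def user_queries(i: int, n: int):
    """Return the two queries for index i: a uniform mask r, and r with bit i flipped.

    Each query on its own is a uniformly random n-bit string, whatever i is.
    """
    r = [secrets.randbelow(2) for _ in range(n)]
    x1 = list(r)
    x2 = list(r)
    x2[i] ^= 1
    return x1, x2

def server_answer(x, database):
    """Each server returns the XOR of the database bits selected by its query mask."""
    answer = 0
    for bit, selected in zip(database, x):
        answer ^= bit & selected
    return answer

def user_output(y1: int, y2: int) -> int:
    """The two masks differ exactly at position i, so the XOR of the answers is D_i."""
    return y1 ^ y2

if __name__ == "__main__":
    D = [1, 0, 1, 1, 0, 0, 1, 0]
    x1, x2 = user_queries(3, len(D))
    print(user_output(server_answer(x1, D), server_answer(x2, D)))  # equals D[3]
```

Privacy holds because each server sees a single uniformly distributed mask; the interesting question, addressed via smooth codes below, is how to get the same guarantee with far fewer than n bits of communication.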

It turns out that q-query locally decodable codes and q-server PIR 
are essentially equivalent. This equivalence is proven using the notion 
of smooth codes. A code Enc : {0,1}^n → Σ^n̂ is a q-query smooth code
if there is a probabilistic oracle algorithm Dec such that for every
message x and every i ∈ [n], we have Pr[Dec^{Enc(x)}(i) = x_i] = 1 and Dec
makes q nonadaptive queries to its oracle, each of which is uniformly
distributed in [n̂]. Note that the oracle in this definition is a valid
codeword, with no corruptions. Below you will show that smooth 
codes imply locally decodable codes and PIR schemes; converses are 
also known (after making some slight relaxations to the definitions). 


8 Another way is through computational security, where we only require that it be
computationally infeasible for the database to learn something about the entry requested.


(1) Show that the decoder for a q-query smooth code is also a
local (1/3q)-decoder for Enc.

(2) Show that every q-query smooth code Enc : {0,1}^n → Σ^n̂
gives rise to a q-server PIR scheme in which the user and
servers communicate at most q · (log n̂ + log|Σ|) bits for
each database entry requested.

(3) Using the Reed–Muller code, show that there is a polylog(n)-server
PIR scheme with communication complexity
polylog(n) for n-bit databases. That is, the user and servers
communicate at most polylog(n) bits for each database
entry requested. (For constant q, the Reed–Muller code with
an optimal systematic encoding as in Problem 5.4 yields a
q-server PIR with communication complexity O(n^{1/(q−1)}).)


Problem 7.7 (Better Local Decoding of Reed-Muller Codes). 
Show that for every constant ε > 0, there is a constant γ > 0 such that
there is a local (1/2 − ε)-decoding algorithm for the q-ary Reed–Muller
code of degree d and dimension m, provided that d ≤ γq. (Here we are
referring to unique decoding, not list decoding.) The running time of 
the decoder should be poly(m,q). 


Problem 7.8 (Hitting-Set Generators). A set H_m ⊆ {0,1}^m is a
(t,ε) hitting set if for every nonuniform algorithm T running in time
t that accepts greater than an ε fraction of m-bit strings, T accepts at
least one element of H_m.


(1) Show that if, for every m, we can construct an
(m,1/2) hitting set H_m in time s(m) ≥ m, then
RP ⊆ ∪_c DTIME(s(n^c)). In particular, if s(m) = poly(m),
then RP = P.

(2) Show that if there is a (t,ε) pseudorandom generator
G_m : {0,1}^d → {0,1}^m computable in time s, then there is a
(t,ε) hitting set H_m constructible in time 2^d · s.

(3) Show that if, for every m, we can construct an (m,1/2) hitting
set H_m in time s(m) = poly(m), then BPP = P. (Hint:
this can be proven in two ways. One uses Problem 3.1 and
the other uses a variant of Problem 7.1 together with
Corollary 7.64. How do the parameters for general s(m) compare?)

(4) Define the notion of a (t,k,ε) black-box construction of
hitting-set generators, and show that, when t = ∞, such
constructions are equivalent to constructions of dispersers
(Definition 6.19).


Problem 7.9 (PRGs versus Uniform Algorithms ⇒ Average-Case
Derandomization). For functions t : N → N and ε : N → [0,1],
we say that a sequence {G_m : {0,1}^{d(m)} → {0,1}^m} is a (t(m),ε(m))
pseudorandom generator against uniform algorithms iff the ensembles
{G(U_{d(m)})}_{m∈N} and {U_m}_{m∈N} are uniformly computationally
indistinguishable (Definition 7.2).

Suppose that we have a mildly explicit (m,1/m) pseudorandom
generator against uniform algorithms that has seed length d(m). Show that
for every language L in BPP, there exists a deterministic algorithm
A running in time 2^{d(poly(n))} · poly(n) on inputs of length n such that:


(1) Pr[A(X_n) = L(X_n)] ≥ 1 − 1/n^2, where X_n ← {0,1}^n and
L(·) is the characteristic function of L. (The exponent of
2 in n^2 is arbitrary, and can be replaced by any constant.)
Hint: coming up with the algorithm A is the “easy” part;
proving that it works well is a bit trickier.

(2) Pr[A(X_n) = L(X_n)] ≥ 1 − 1/n^2, for any random variable
X_n distributed on {0,1}^n that is samplable in time n^2.


Problem 7.10 (PRGs are Necessary for Derandomization). 


(1) Call a function G : {0,1}^d → {0,1}^m a (t,ℓ,ε) pseudorandom
generator against bounded-nonuniformity algorithms iff for
every probabilistic algorithm T that has a program of length
at most ℓ and that runs in time at most t on inputs of
length n, we have

\[
\bigl|\Pr[T(G(U_d)) = 1] - \Pr[T(U_m) = 1]\bigr| \le \varepsilon.
\]

Consider the promise problem Π whose YES instances
are truth tables of functions G : {0,1}^d → {0,1}^m that are
(m,log m,1/m) pseudorandom generators against
bounded-nonuniformity algorithms, and whose NO instances are
truth tables of functions that are not (m,log m,2/m)
pseudorandom generators against bounded-nonuniformity
algorithms. (Here m and d are parameters determined by
the input instance G.) Show that Π is in prBPP.

(2) Using Problem 2.11, show that if prBPP=prP, then 
there is a mildly explicit (m,1/m) pseudorandom generator 
against uniform algorithms with seed length O(logm). 
(See Problem 7.9 for the definition. It turns out that the 
hypothesis prBPP = prP here can be weakened to obtain 
an equivalence between PRGs vs. uniform algorithms and 
average-case derandomization of BPP.) 


Problem 7.11 (Composition). For simplicity, only
consider constant t in this problem (although the results do have
generalizations to growing t = t(ℓ)).


(1) Show that if f : {0,1}^ℓ → {0,1}^ℓ is a one-way permutation,
then for any constant t, f^{(t)} is a one-way permutation, where

\[
f^{(t)}(x) = \underbrace{f(f(\cdots f}_{t}(x))).
\]

(2) Show that the above fails for one-way functions. That
is, assuming that there exists a one-way function g,
construct a one-way function f which doesn’t remain one
way under composition. (Hint: for |x| = |y| = ℓ/2, set
f(x,y) = 1^{|g(y)|} g(y) unless x ∈ {0^{ℓ/2},1^{ℓ/2}}.)

(3) Show that if G is a cryptographic pseudorandom generator
with seed length d(m) = m^{Ω(1)}, then for any constant t, G^{(t)}
is a cryptographic pseudorandom generator. Note where
your proof fails for fully explicit pseudorandom generators
against time m^c for a fixed constant c.


Problem 7.12 (Local List Decoding the Hadamard Code). For
a function f : Z_2^m → Z_2, a parameterized subspace x + V of Z_2^m of
dimension d is given by a linear map V : Z_2^d → Z_2^m and a shift x ∈ Z_2^m.
(We do not require that the map V be full rank.) We write V for
0 + V. For a function f : Z_2^m → Z_2, we define f|_{x+V} : Z_2^d → Z_2 by
f|_{x+V}(y) = f(x + V(y)).


(1) Let c : Z_2^m → Z_2 be a codeword in the Hadamard code
(i.e., a linear function), r : Z_2^m → Z_2 a received word, V a
parameterized subspace of Z_2^m of dimension d, and x ∈ Z_2^m.
Show that if d_H(r|_{x+V},c|_{x+V}) < 1/2, then c(x) can be
computed from x, V, c|_V, and oracle access to r in time
poly(m,2^d) with 2^d − 1 queries to r.

(2) Show that for every m ∈ N and ε > 0, the Hadamard code
of dimension m has a (1/2 − ε) local list-decoding algorithm
(Dec_1,Dec_2) in which both Dec_1 and Dec_2 run in time
poly(m,1/ε), and the list output by Dec_1 has size O(1/ε^2).
(Hint: consider a random parameterized subspace V of
dimension 2 log(1/ε) + O(1), and how many choices there
are for c|_V.)

(3) Show that Dec_2 can be made to be deterministic and run in
time O(m).


Problem 7.13 (Hardcore Predicates). A hardcore predicate for
a one-way function f : {0,1}^ℓ → {0,1}^ℓ is a poly(ℓ)-time computable
function b : {0,1}^ℓ → {0,1} such that for every constant c, every
nonuniform algorithm A running in time ℓ^c, we have:

\[
\Pr[A(f(U_\ell)) = b(U_\ell)] \le \frac{1}{2} + \frac{1}{\ell^c}
\]

for all sufficiently large ℓ. Thus, while the one-wayness of f only
guarantees that it is hard to compute all the bits of f’s input from its
output, b specifies a particular bit of information about the input that
is very hard to compute (one can’t do noticeably better than random
guessing).


(1) Let Enc : {0,1}^ℓ → {0,1}^ℓ̂ be a code such that given
x ∈ {0,1}^ℓ and y ∈ [ℓ̂], Enc(x)_y can be computed in time
poly(ℓ). Suppose that for every constant c and all sufficiently
large ℓ, Enc has a (1/2 − 1/ℓ^c) local list-decoding algorithm
(Dec_1,Dec_2) in which both Dec_1 and Dec_2 run in time
poly(ℓ). Prove that if f : {0,1}^ℓ → {0,1}^ℓ is a one-way
function, then b(x,y) = Enc(x)_y is a hardcore predicate for
the one-way function f′(x,y) = (f(x),y).

(2) Show that if b : {0,1}^ℓ → {0,1} is a hardcore predicate for
a one-way permutation f : {0,1}^ℓ → {0,1}^ℓ, then for every
m = poly(ℓ), the following function G : {0,1}^ℓ → {0,1}^m is
a cryptographic pseudorandom generator:

\[
G(x) = \bigl(b(x),\, b(f(x)),\, b(f(f(x))),\, \ldots,\, b(f^{(m-1)}(x))\bigr).
\]

(Hint: show that G is “previous-bit unpredictable.”)

(3) Using Problem 7.12, deduce that if f : {0,1}^ℓ → {0,1}^ℓ is
a one-way permutation, then for every m = poly(ℓ), the
following is a cryptographic pseudorandom generator:

\[
G(x,r) = \bigl(\langle x,r\rangle,\, \langle f(x),r\rangle,\, \langle f(f(x)),r\rangle,\, \ldots,\, \langle f^{(m-1)}(x),r\rangle\bigr).
\]
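The iterated structure of these generators can be sketched directly. The code below only illustrates the shape of the output (b(x), b(f(x)), ..., b(f^{(m−1)}(x))); `toy_permutation` is a hypothetical affine map that is certainly not one-way, so nothing here is pseudorandom in the sense of the problem, and the security argument is left to the exercise.

```python
def inner_product_bit(x: int, r: int) -> int:
    """<x, r> mod 2, with x and r given as integers whose bits are the vectors."""
    return bin(x & r).count("1") % 2

def iterate_generator(f, b, x, m):
    """Output (b(x), b(f(x)), ..., b(f^(m-1)(x))) as a list of m bits."""
    out = []
    for _ in range(m):
        out.append(b(x))
        x = f(x)
    return out

def toy_permutation(x: int) -> int:
    """A permutation of {0, ..., 15}; a placeholder only, trivially invertible."""
    return (5 * x + 3) % 16

if __name__ == "__main__":
    r = 0b0001
    print(iterate_generator(toy_permutation, lambda x: inner_product_bit(x, r), 0, 4))
```

Each output bit costs one evaluation of f and one of b, so m = poly(ℓ) output bits are produced in polynomial time from a seed of ℓ (or 2ℓ, counting r) bits.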


Problem 7.14 (PRGs from 1–1 One-Way Functions). A
random variable X has (t,ε) pseudoentropy at least k if it is (t,ε)
indistinguishable from some random variable of min-entropy at least k.


(1) Suppose that X has (t,ε) pseudoentropy at least k and that
Ext : {0,1}^n × {0,1}^d → {0,1}^m is a (k,ε′)-extractor
computable in time t′. Show that Ext(X,U_d) is (t − t′,ε + ε′)
indistinguishable from U_m.

(2) Let f : {0,1}^ℓ → {0,1}^n be a one-to-one one-way function
(not necessarily length-preserving) and b : {0,1}^ℓ → {0,1} a
hardcore predicate for f (see Problem 7.13). Show that for
every constant c and all sufficiently large ℓ, the random variable
f(U_ℓ)b(U_ℓ) has (ℓ^c,1/ℓ^c) pseudoentropy at least ℓ + 1.

(3) (*) Show how to construct a cryptographic pseudorandom
generator from any one-to-one one-way function. (Any seed
length ℓ(m) < m is fine.)


7.9 Chapter Notes and References 


Other surveys on pseudorandom generators and derandomization 
include [162, 209, 226, 288]. 

Descriptions of classical constructions of pseudorandom generators 
(e.g., linear congruential generators) and the batteries of statistical 
tests that are used to evaluate them can be found in [245, 341]. Linear 
congruential generators and variants were shown to be cryptographi- 
cally insecure (e.g., not satisfy Definition 7.9) in [56, 80, 144, 252, 374]. 
Current standards for pseudorandom generation in practice can be 
found in [51]. 

The modern approach to pseudorandomness described in this 
section grew out of the complexity-theoretic approach to cryptography 
initiated by Diffie and Hellman [117] (who introduced the concept 
of one-way functions, among other things). Shamir [360] constructed 
a generator achieving a weak form of unpredictability based on the 
conjectured one-wayness of the RSA function [336]. (Shamir’s generator
outputs a sequence of long strings, such that none of the strings
can be predicted from the others, except with negligible probability, 
but individual bits may be easily predictable.) Blum and Micali [72] 
proposed the criterion of next-bit unpredictability (Definition 7.15) 
and constructed a generator satisfying it based on the conjectured
hardness of the Discrete Logarithm Problem. Yao [421] gave the
now-standard definition of pseudorandomness (Definition 7.3) based 
on the notion of computational indistinguishability introduced in the 
earlier work of Goldwasser and Micali [176] (which also introduced 
hybrid arguments). Yao also proved the equivalence of pseudoran- 
domness and next-bit unpredictability (Proposition 7.16), and showed 
how to construct a cryptographic pseudorandom generator from any 
one-way permutation. The construction described in Section 7.2 and 
Problems 7.12 and 7.13 uses the hardcore predicate from the later work 
of Goldreich and Levin [168]. The construction of a pseudorandom 
generator from an arbitrary one-way function (Theorem 7.11) is due 
to Håstad, Impagliazzo, Levin, and Luby [197]. The most efficient
(and simplest) construction of pseudorandom generators from general 
one-way functions to date is in [198, 401]. Goldreich, Goldwasser, 
and Micali [164] defined and constructed pseudorandom functions, 
and illustrated their applicability in cryptography. The application of 
pseudorandom functions to learning theory is from [405], and their 
application to circuit lower bounds is from [323]. For more about 
cryptographic pseudorandom generators, pseudorandom functions, 
and their applications in cryptography, see the text by Goldreich [157]. 

Yao [421] demonstrated the applicability of pseudorandom generators
to derandomization, noting in particular that cryptographic
pseudorandom generators imply that BPP ⊂ SUBEXP, and that one
can obtain even BPP ⊂ P under stronger intractability assumptions.
Nisan and Wigderson [302] observed that derandomization only 
requires a mildly explicit pseudorandom generator, and showed how 
to construct such generators based on the average-case hardness 
of E (Theorem 7.24). A variant of Open Problem 7.25 was posed 
in [202], who showed that it also would imply stronger results on 
hardness amplification; some partial negative results can be found 
in [214, 320].

The instantiation of the Nisan—Wigderson pseudorandom generator 
that uses the parity function to fool constant-depth circuits (Theo- 
rem 7.29) is from the earlier work of Nisan [298]. (The average-case 
hardness of parity against constant-depth circuits stated in
Theorem 7.27 is due to Boppana and Håstad [196].) The first unconditional
pseudorandom generator against constant-depth circuits was due to
Ajtai and Wigderson [12] and had seed length ℓ(m) = m^ε (compared
to polylog(m) in Nisan’s generator). Recently, Braverman [81] proved
that any polylog(m)-wise independent distribution fools AC^0, providing
a different way to obtain polylogarithmic seed length and resolving
a conjecture of Linial and Nisan [265]. The notion of strong
pseudorandom generators (a.k.a. seed-extending pseudorandom generators)
and the average-case derandomization of AC^0 (Problem 7.5) are from
[242, 355]. Superpolynomial circuit lower bounds for AC^0[2] were
given by [322, 368]. Viola [412] constructed pseudorandom generators
with superpolynomial stretch for AC^0[2] circuits that are restricted
to have a logarithmic number of parity gates. 

Detailed surveys on locally decodable codes and their applications 
in theoretical computer science are given by Trevisan [391] and 
Yekhanin [424]. The notion grew out of several different lines of work, 
and it took a couple of years before a precise definition of locally decod- 
able codes was formulated. The work of Goldreich and Levin [168] on 
hardcore predicates of one-way permutations implicitly provided a local 
list-decoding algorithm for the Hadamard code. (See Problems 7.12 
and 7.13.) Working on the problem of “instance hiding” introduced 
in [2], Beaver and Feigenbaum [54] constructed a protocol based on 
Shamir’s “secret sharing” [359] that effectively amounts to using the 
local decoding algorithm for the Reed–Muller code (Algorithm 7.43) 
with the multilinear extension (Lemma 7.47). Blum, Luby, and Rubin- 
feld [71] and Lipton [266] introduced the concept of self-correctors 
for functions, which allow one to convert a program that correctly 
computes a function on most inputs to one that correctly computes 
the function on all inputs.⁹ Both papers gave self-correctors for group 
homomorphisms, which, when applied to homomorphisms from Z_2^n 
to Z_2, can be interpreted as a local corrector for the Hadamard code 
(Proposition 7.40). Lipton [266] observed that the techniques of Beaver 
and Feigenbaum [54] yield a self-corrector for multivariate polynomials, 
which, as mentioned above, can be interpreted as a local corrector for 
the Reed–Muller code. Lipton pointed out that it is interesting to apply 
these self-correctors to presumably intractable functions, such as the 
Permanent (known to be #P-complete [404]), and soon it was realized 
that they could also be applied to complete problems for other classes 
by taking the multilinear extension [42]. Babai, Fortnow, Nisan, and 
Wigderson [43] used these results to construct pseudorandom generators 
from the worst-case hardness of EXP (or E, due to Problem 7.2), 
and thereby obtain subexponential-time or quasipolynomial-time 
simulations of BPP under appropriate worst-case assumptions 
(Corollary 7.64, Parts 2 and 3). All of these works also used the 
terminology of random self-reducibility, which had been present in 
the cryptography literature for a while [29], and was known to imply 
worst-case/average-case connections. Understanding the relationship 
between the worst-case and average-case complexity of NP (rather 
than “high” classes like EXP) is an important area of research; see the 
survey [74]. 

⁹ Blum, Luby, and Rubinfeld [71] also defined and constructed self-testers for functions, 
which allow one to efficiently determine whether a program does indeed compute a function 
correctly on most inputs before attempting to use self-correction. Together a self-tester and 
self-corrector yield a “program checker” in the sense of [70]. The study of self-testers gave 
rise to the notion of locally testable codes, which are intimately related to probabilistically 
checkable proofs [41, 42], and to the notion of property testing [165, 337, 340], which is an 
area within sublinear-time algorithms. 
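
To make the idea of a self-corrector for homomorphisms from Z_2^n to Z_2 concrete, the following is a minimal Python sketch, not the construction from any of the cited papers. It assumes we are given a program g that agrees with some linear function f on all but a δ < 1/4 fraction of inputs. For a uniformly random r, both r and x + r are uniformly distributed, so g(x + r) + g(r) equals f(x) except with probability at most 2δ, and a majority vote over independent trials recovers f(x) on every input x. The names make_linear, corrupt, and self_correct, and all parameter values, are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def make_linear(a):
    """Return the homomorphism x -> <a, x> mod 2 from Z_2^n to Z_2."""
    def f(x):
        return sum(ai & xi for ai, xi in zip(a, x)) % 2
    return f


def corrupt(f, n, fraction, rng):
    """Return a program that disagrees with f on a random `fraction` of inputs.

    Illustrative stand-in for a faulty program; enumerates all 2^n inputs,
    so it is only meant for small n.
    """
    all_inputs = [tuple((i >> j) & 1 for j in range(n)) for i in range(2 ** n)]
    bad = set(rng.sample(all_inputs, int(fraction * len(all_inputs))))

    def g(x):
        y = f(x)
        return 1 - y if tuple(x) in bad else y
    return g


def self_correct(g, x, n, trials, rng):
    """Estimate f(x) by a majority vote over g(x + r) + g(r) for random r.

    Each trial queries g at two points that are individually uniform,
    so a single trial errs with probability at most twice the corruption rate.
    Use an odd number of trials to avoid ties.
    """
    votes = Counter()
    for _ in range(trials):
        r = tuple(rng.randrange(2) for _ in range(n))
        x_plus_r = tuple(xi ^ ri for xi, ri in zip(x, r))
        votes[g(x_plus_r) ^ g(r)] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 8
    f = make_linear((1, 0, 1, 1, 0, 0, 1, 0))
    g = corrupt(f, n, 0.10, rng)
    x = (1, 1, 0, 1, 0, 0, 0, 1)
    print("f(x) =", f(x), " corrected g(x) =", self_correct(g, x, n, 101, rng))
```

Viewed as coding theory, the truth table of g is a received word close to a Hadamard codeword, and self_correct reads only two positions of it per trial, which is what makes the procedure a local corrector.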

Self-correctors for multivariate polynomials that can handle a 
constant fraction of errors (as in Theorem 7.42) and fraction of errors 
approaching 1/2 (as in Problem 7.7) were given by Gemmell et al. [149] 
and Gemmell and Sudan [150], respectively. Babai, Fortnow, Levin, and 
Szegedy [41] reformulated these results as providing error-correcting 
codes with efficient local decoding (and “local testing”) algorithms. 
Katz and Trevisan [239] focused attention on the exact query complex- 
ity of locally decodable codes (separately from computation time), and 
proved that locally decodable codes cannot simultaneously have the 
rate, distance, and query complexity all be constants independent of 
the message length. Constructions of 3-query locally decodable codes 
with subexponential blocklength were recently given by Yekhanin [423] 
and Efremenko [128]. Private Information Retrieval (Problem 7.6) 
was introduced by Chor, Goldreich, Kushilevitz, and Sudan [99]. 
Katz and Trevisan [239] introduced the notion of smooth codes and 
showed their close relation to both private information retrieval and 
locally decodable codes (Problem 7.6). Recently, Saraf, Kopparty, 
and Yekhanin [249] constructed the first locally decodable codes with 
sublinear-time decoding and rate larger than 1/2. 

Techniques for Hardness Amplification (namely, the Direct Product 
Theorem and XOR Lemma) were first described in oral presentations of 
Yao’s paper [421]. Since then, these results have been strengthened and 
generalized in a number of ways. See the survey [171] and Section 8.2.3. 
The first local list-decoder for Reed–Muller codes was given by Arora 
and Sudan [35] (stated in the language of program self-correctors). The 
one in Theorem 7.56 is due to Sudan, Trevisan, and Vadhan [381], who 
also gave a general definition of locally list-decodable codes (inspired 
by a list-decoding analogue of program self-correctors defined by Ar 
et al. [30]) and explicitly proved Theorems 7.60, 7.61, and 7.62. 

The result that BPP = P if E has a function of nonuniform worst-case 
hardness s(ℓ) = 2^{Ω(ℓ)} (Corollary 7.64, Part 1) is from the earlier 
work of Impagliazzo and Wigderson [215], who used derandomized 
versions of the XOR Lemma to obtain sufficient average-case hardness 
for use in the Nisan–Wigderson pseudorandom generator. An optimal 
construction of pseudorandom generators from worst-case hard functions, 
with seed length d(m) = O(s^{-1}(poly(m))) (cf. Theorem 7.63), 
was given by Shaltiel and Umans [356, 399]. 

For more background on AM, see the Notes and References of 
Section 2. The first evidence that AM = NP was given by Arvind 
and Köbler [37], who showed that one can use the Nisan–Wigderson 
generator with a function that is (2^{Ω(ℓ)}, 1/2 − 1/2^{Ω(ℓ)})-hard for 
nondeterministic circuits. Klivans and van Melkebeek [244] observed that 
the Impagliazzo–Wigderson pseudorandom generator construction 
is “black box” and used this to show that AM can be derandomized 
using functions that are worst-case hard for circuits with 
an NP oracle (Theorem 7.68). Subsequent work showed that one 
only needs worst-case hardness against a nonuniform analogue of 
NP ∩ co-NP [289, 356, 357]. 

Trevisan [389] showed that black-box pseudorandom generator 
constructions yield randomness extractors, and thereby obtained the 
extractor construction of Theorem 7.73. This surprising connection 
between complexity-theoretic pseudorandomness and information-theoretic 
pseudorandomness sparked much subsequent work, from 
which the unified theory presented in this survey emerged. The fact 
that black-box hardness amplifiers are a form of locally list-decodable 
codes was explicitly stated (and used to deduce lower bounds on 
advice length) in [397]. The use of black-box constructions to classify 
and separate cryptographic primitives was pioneered by Impagliazzo 
and Rudich [213]; see also [326, 330]. 

Problem 7.1 (PRGs imply hard functions) is from [302]. Problem 7.2 
is a special case of the technique called “translation” or “padding” 
in complexity theory. Problem 7.4 (Deterministic Approximate 
Counting) is from [302]. The fastest known deterministic algorithms 
for approximately counting the number of satisfying assignments to 
a DNF formula are from [280] and [178] (depending on whether the 
approximation is relative or additive, and the magnitude of the error). 
The fact that hitting set generators imply BPP = P (Problem 7.8) 
was first proven by Andreev, Clementi, and Rolim [27]; for a more 
direct proof, see [173]. Problem 7.9 (that PRGs vs. uniform algorithms 
imply average-case derandomization) is from [216]. Goldreich [163] 
showed that PRGs are necessary for derandomization (Problem 7.10). 
The result that one-to-one one-way functions imply pseudorandom 
generators is due to Goldreich, Krawczyk, and Luby [167]; the proof 
in Problem 7.14 is from [197]. 

For more on Kolmogorov complexity, see [261]. In recent years, 
connections have been found between Kolmogorov complexity and 
derandomization; see [14]. The tighter equivalence between circuit size 
and nonuniform computation time mentioned after Definition 7.1 is due 
to Pippenger and Fischer [311]. The 5n − o(n) lower bound on circuit 
size is due to Iwama, Lachish, Morizumi, and Raz [218, 254]. The 
fact that single-sample indistinguishability against uniform algorithms 
does not imply multiple-sample indistinguishability unless we make 
additional assumptions such as efficient samplability (in contrast to 
Proposition 7.14), is due to Goldreich and Meyer [169]. (See also [172].) 


